Methods and apparatus to project ratings for future broadcasts of media

ABSTRACT

Methods, apparatus, systems and articles of manufacture are disclosed to project ratings for future broadcasts of media. Disclosed example methods include normalizing, with a processor, audience measurement data corresponding to media exposure data, social media exposure data and programming information associated with a future quarter to determine normalized audience measurement data. Disclosed example methods also include classifying a media asset based on the programming information to determine a media asset classification. Disclosed example methods also include building, with the processor, a projection model based on a first subset of the normalized audience measurement data, the first subset of the normalized audience measurement data associated with a first time frame relative to the future quarter, the first subset of the normalized audience measurement data based on the media asset classification, and applying, with the processor, the programming information to the projection model to project ratings for the media asset.

RELATED APPLICATION

This patent claims the benefit of, and priority from, U.S. Provisional Patent Application No. 62/083,716, filed Nov. 24, 2014, entitled “Methods and Apparatus to Predict TV Rating Lift.” U.S. Provisional Patent Application No. 62/083,716 is hereby incorporated by reference in its entirety.

FIELD OF THE DISCLOSURE

This disclosure relates generally to audience measurement, and, more particularly, to methods and apparatus to project ratings for future broadcasts of media.

BACKGROUND

Audience measurement of media (e.g., content and/or advertisements presented by any type of medium such as television, in theater movies, radio, Internet, etc.) is typically carried out by monitoring media exposure of panelists that are statistically selected to represent particular demographic groups. Audience measurement companies, such as The Nielsen Company (US), LLC, enroll households and persons to participate in measurement panels. By enrolling in these measurement panels, households and persons agree to allow the corresponding audience measurement company to monitor their exposure to information presentations, such as media output via a television, a radio, a computer, etc. Using various statistical methods, the collected media exposure data is processed to determine the size and/or demographic composition of the audience(s) for media of interest. The audience size and/or demographic information is valuable to, for example, advertisers, broadcasters, content providers, manufacturers, retailers, product developers, and/or other entities. For example, audience size and demographic information is a factor in the placement of advertisements, in valuing commercial time slots during a particular program and/or generating ratings for piece(s) of media.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an example system for audience measurement analysis implemented in accordance with the teachings of this disclosure to project ratings for future broadcasts of media.

FIG. 2 is an example upfront programming schedule that may be used by the example central facility of FIG. 1 to determine media asset(s) for which to project ratings.

FIG. 3 is an example data table that may be used by the example central facility of FIG. 1 to store raw data variables in the example raw data database of FIG. 1.

FIG. 4 is an example block diagram of an example implementation of the data transformer of FIG. 1.

FIG. 5 is an example data table that may be used by the example data transformer of FIGS. 1 and/or 4 to transform ratings data variables into ratings predictive features.

FIG. 6 is an example data table that may be used by the example data transformer of FIGS. 1 and/or 4 to transform program attributes data variables into program attributes predictive features.

FIG. 7 is an example data table that may be used by the example data transformer of FIGS. 1 and/or 4 to transform social media data variables into social media predictive features.

FIG. 8 is an example data table that may be used by the example data transformer of FIGS. 1 and/or 4 to transform spending data variables into advertisement spending predictive features.

FIG. 9 is an example data table that may be used by the example data transformer of FIGS. 1 and/or 4 to transform universe estimates data variables into universe estimate predictive features.

FIG. 10 is a flowchart representative of example machine-readable instructions that may be executed by the example central facility of FIG. 1 to project ratings for future broadcasts of media.

FIG. 11 is a flowchart representative of example machine-readable instructions that may be executed by the example media mapper of FIG. 1 to catalog related media.

FIG. 12 is a flowchart representative of example machine-readable instructions that may be executed by the example data transformer of FIGS. 1 and/or 4 to transform raw audience measurement data to predictive features.

FIG. 13 is a flowchart representative of example machine-readable instructions that may be executed by the example central facility of FIG. 1 to project ratings for future broadcasts of media.

FIG. 14 is an example schema that may be used by the example central facility of FIG. 1 to determine predictive features associated with a first module.

FIG. 15 is an example schema that may be used by the example central facility of FIG. 1 to determine predictive features associated with a second module.

FIG. 16 is an example schema that may be used by the example central facility of FIG. 1 to determine predictive features associated with a third module.

FIG. 17 is a flowchart representative of example machine-readable instructions that may be executed by the example model builder of FIG. 1 to project ratings for future broadcasts of media.

FIG. 18 is a flowchart representative of example machine-readable instructions that may be executed by the example future ratings projector of FIG. 1 to project ratings for future broadcasts of media.

FIG. 19 is a block diagram of an example processing platform structured to execute the example machine-readable instructions of FIGS. 10-16 and/or 17 to implement the example central facility and/or the example data transformer of FIGS. 1 and/or 4.

Wherever possible, the same reference numbers will be used throughout the drawing(s) and accompanying written description to refer to the same or like parts.

DETAILED DESCRIPTION

Examples disclosed herein facilitate projecting ratings for future broadcasts of media. Disclosed examples enable estimating television ratings for households that will tune to (or persons that will be exposed to) a program in a future quarter. For example, near-term projections enable estimating the television ratings for a program that will be broadcast within two quarters of the current quarter, while upfront projections enable estimating the television ratings for a program that will be broadcast in three or more quarters from the current quarter.

Exposure information (e.g., ratings) may be useful for determining a marketing campaign and/or evaluating the effectiveness of a marketing campaign. For example, an advertiser who wants exposure of their asset (e.g., a product, a service, etc.) to reach a specific audience will place advertisements in media (e.g., a television program) whose audience represents the characteristics of the target market. In some examples, networks determine the cost of including an advertisement in their media based on the ratings of the media. For example, a high rating for a television program represents a large number of audience members who tuned to (or were exposed to) the television program. In such instances, the larger the audience of a television program (e.g., a higher rating), the more networks can charge for advertisements during the program.

In the North American television industry, an upfront is a meeting hosted at the start of important advertising sales periods by television network executives, attended by the press and major advertisers. It is so named because of its main purpose, to allow marketers to buy television commercial airtime “up front,” or several months before a television season begins. In some examples disclosed herein, an upfront projection model is developed to predict upfront TV ratings. For example, examples disclosed herein include a central facility that is operated by an audience measurement entity (AME). In some examples, the central facility and/or the AME may collect measurement information (e.g., raw data inputs) including historical TV ratings (e.g., NPower historical TV ratings), social media information (e.g., information collected from social media services such as Twitter, Google+, Facebook, Instagram, etc.), genre information (e.g., genre data derived from NPower genre data), sponsored-media spending (e.g., ad-spending data provided by, for example, a media provider), etc. NPower is an example platform of historical TV ratings developed by The Nielsen Company (US), LLC. The NPower platform includes related applications and tools, such as National TV Toolbox, that provide audience measurement in the US and globally. In some examples, the central facility incorporates additional information, such as TV brand effects (TVBE) information. TVBE is an example metric developed by The Nielsen Company (US), LLC to measure a TV advertisement's “breakthrough” or “resonance.”

In some disclosed examples, the central facility develops models to predict upfront TV ratings using telecast-level data. In some such examples, the central facility generates the predictions for each telecast. The telecast predictions may be aggregated to provide program-level and/or network-level predictions. Separate models may also be developed at a program level. In some examples, the central facility mines historical database(s) to identify programs that can be used to improve the predictions of new programs (e.g., relevant programs). The relevancy of past programs for predicting future programs is measured in several dimensions, including, for example, program content, program titles, network line-up (including day parts), etc. In some examples, historical TV ratings have been shown to significantly improve the accuracy of such prediction models, and, in some instances, have accounted for an average 80% of the explanatory power of the model. The example central facility may transform the raw data inputs (e.g., historical TV ratings, social media information, genre information, sponsored-media spending information and/or TVBE, etc.) into predictive variables/features that are used as predictors in the predictive models. In some examples, the central facility identifies (e.g., automatically identifies) the predictive features among a pool of many features, as well as the most efficient techniques and/or algorithms to utilize these features. Example techniques and/or algorithms used by the central facility include statistical analysis (e.g., regression models, time-series models), data and text mining, machine learning models and/or agent-based models. In some examples, the process of data mining and deep learning is automated to minimize (e.g., reduce) manual/subjective input and to reduce the amount of time required.
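As an illustration of how telecast-level predictions might be rolled up to the program level, the following is a minimal sketch. The disclosure does not state how the aggregation is performed, so the use of a simple average, the function name aggregate_by_program and the sample values are all assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_program(telecast_predictions):
    """Average telecast-level rating predictions for each program.

    telecast_predictions is an iterable of (program, predicted_rating)
    pairs. Averaging is an assumed aggregation rule, not one stated in
    the disclosure.
    """
    by_program = defaultdict(list)
    for program, rating in telecast_predictions:
        by_program[program].append(rating)
    return {program: mean(ratings) for program, ratings in by_program.items()}


# Hypothetical telecast-level predictions.
telecasts = [("Program A", 2.0), ("Program A", 3.0), ("Program B", 1.5)]
program_level = aggregate_by_program(telecasts)
```

The same grouping could be keyed on a network identifier instead of a program identifier to obtain network-level predictions.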

In some examples, the central facility processes the data over 2-4 weeks to build the predictive models. In some such examples, once built, the central facility applies the model over a 2-day period to predict new data. The longevity of the model (e.g., how often the model needs to be re-calibrated) depends on how fast the market dynamics change. For example, the model may be re-calibrated once a year.

In some examples, when historical ratings are used, the central facility uses a gap of 1 quarter (13 weeks) when developing the projection models. This gap makes it more challenging to achieve high accuracy, but it is nevertheless desirable for mid-term projections, such as upfront projections. To measure the developed model's performance, a mean percentage error metric (e.g., (actual/forecast) - 1, expressed as a percentage) and/or an R-sq metric (e.g., measured between actual ratings and predicted ratings) may be used.
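The two performance metrics mentioned above can be sketched as follows. This is a minimal illustration and not the disclosed implementation: the function names are hypothetical, the error is returned as a fraction (multiply by 100 for a percentage), and R-sq is assumed here to be the coefficient of determination, although it could also be read as a squared correlation.

```python
def mean_percentage_error(actual, forecast):
    """Mean of (actual / forecast) - 1 over all paired ratings."""
    errors = [(a / f) - 1.0 for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)


def r_squared(actual, predicted):
    """Coefficient of determination between actual and predicted ratings."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and forecast ratings for three telecasts.
actual_ratings = [1.0, 2.0, 3.0]
forecast_ratings = [1.0, 2.0, 4.0]
mpe = mean_percentage_error(actual_ratings, forecast_ratings)
r2 = r_squared(actual_ratings, forecast_ratings)
```

Because positive and negative errors cancel in a mean percentage error, a value near zero indicates low bias rather than low error on every telecast.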

In some examples, the central facility includes all program information when developing the projection models. In some such examples, the projection model includes only historical ratings (e.g., information collected via the NPower platform). In some examples, the projection model is tested using a hold-out test data set. Hold-out test data sets are not used to train the model, and, thus, are better suited to measure how the projection models perform for the purpose of rating predictions.
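A hold-out split can be sketched as follows. The disclosure does not say how the hold-out set is selected, so the random selection, the 20% fraction and the name split_hold_out are assumptions for illustration.

```python
import random


def split_hold_out(records, hold_out_fraction=0.2, seed=0):
    """Return (training, hold_out) lists with no record in both.

    Records are shuffled with a fixed seed so that the split is
    reproducible. Random selection is an assumed strategy.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_hold_out = int(round(len(shuffled) * hold_out_fraction))
    return shuffled[n_hold_out:], shuffled[:n_hold_out]


# Ten hypothetical telecast identifiers.
training, hold_out = split_hold_out(range(10))
```

For time-ordered ratings data, holding out the most recent telecasts instead of a random sample may better mimic projecting a future quarter.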

FIG. 1 is a diagram of an example environment in which an example system 100 constructed in accordance with the teachings of this disclosure operates to project future ratings for media of interest. The example system 100 of FIG. 1 includes one or more audience measurement system(s) 105, an example client 170 and an example central facility 125 to facilitate projecting future ratings for media of interest in accordance with the teachings of this disclosure. In the illustrated example of FIG. 1, the central facility 125 estimates a percentage of a universe of TV households (or other specified group) that will tune to a program in a future period (e.g., in the next quarter (e.g., fiscal quarter), in three quarters, etc.) by generating ratings projection model(s) based on, for example, historical ratings, program characteristics, social media indicators, advertisement spending, programming schedules, etc.

The example system 100 of FIG. 1 includes the one or more audience measurement system(s) 105 to collect audience measurement data 110 from panelists and non-panelists. The example audience measurement system(s) 105 of FIG. 1 collect panelist media measurement data 110A via, for example, people meters operating in statistically-selected households, set-top boxes and/or other media devices (e.g., such as digital video recorders, personal computers, tablet computers, smartphones, etc.) capable of monitoring and returning monitored data for media presentations, etc. The example panelist media measurement data 110A of FIG. 1 includes media exposure data such as live exposure data, delayed exposure data (e.g., relative to time-shifted viewing of media via, for example, a digital video recorder and/or video on-demand), media performance data, such as TV ratings (e.g., historical TV ratings), program characteristics (e.g., attributes), such as broadcast day-of-week information, broadcast time information, originator information (e.g., a network or channel that broadcasts the media), genre information, universe estimates (e.g., an estimated number of actual households or people from which a sample will be taken and to which data from the sample will be projected), etc. In some examples, the panelist media measurement data 110A is associated with demographic information (e.g., gender, age, income, etc.) of the panelists exposed to the media.

As used herein, the term “media” includes any type of content and/or advertisement delivered via any type of distribution medium. Thus, media includes television programming or advertisements, radio programming or advertisements, movies, web sites, streaming media, etc.

Example methods, apparatus, and articles of manufacture disclosed herein monitor media presentations at media devices. Such media devices may include, for example, Internet-enabled televisions, personal computers, Internet-enabled mobile handsets (e.g., a smartphone), video game consoles (e.g., Xbox®, PlayStation®), tablet computers (e.g., an iPad®), digital media players (e.g., a Roku® media player, a Slingbox®, etc.), etc. In some examples, media monitoring information is aggregated to determine ownership and/or usage statistics of media devices, relative rankings of usage and/or ownership of media devices, types of uses of media devices (e.g., whether a device is used for browsing the Internet, streaming media from the Internet, etc.), and/or other types of media device information. In examples disclosed herein, monitoring information includes, but is not limited to, media identifying information (e.g., media-identifying metadata, codes, signatures, watermarks, and/or other information that may be used to identify presented media), application usage information (e.g., an identifier of an application, a time and/or duration of use of the application, a rating of the application, etc.), and/or user-identifying information (e.g., demographic information, a user identifier, a panelist identifier, a username, etc.).

The example audience measurement system(s) 105 of FIG. 1 also collect social media activity data 110B related to media via, for example, social media servers that provide social media services to users of the social media server. As used herein, the term “social media service” is defined to be a service provided to users to enable users to share information (e.g., text, images, data, etc.) in a virtual community and/or network. Example social media services may include, for example, Internet forums (e.g., a message board), blogs, micro-blogs (e.g., Twitter®), social networks (e.g., Facebook®, LinkedIn, Instagram, etc.), etc. For example, the audience measurement system(s) 105 may monitor social media messages communicated via social media services and identify media-exposure social media messages (e.g., social media messages that reference at least one media asset (e.g., media and/or a media event)). The example audience measurement system(s) 105 may filter the media-exposure social media messages for media-exposure social media messages of interest (e.g., social media messages that reference media of interest).

The example social media activity data 110B of FIG. 1 includes one or more of message identifying information (e.g., a message identifier, a message author, etc.), timestamp information indicative of when the social media message was posted and/or viewed, the content of the social media message and an identifier of the media asset referenced in the media-exposure social media message. In some examples, the audience measurement system(s) 105 may process the media-exposure social media messages of interest and aggregate information related to the social media messages. For example, the audience measurement system(s) 105 may determine a count of the media-exposure social media messages of interest, may determine a number of unique authors who posted the media-exposure social media messages of interest, may determine a number of impressions of (e.g., exposure to) the media-exposure social media messages of interest, etc.
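The aggregation of media-exposure social media messages described above can be sketched as follows. The record layout (dictionary keys "author" and "impressions") and the function name summarize_messages are hypothetical and chosen only for illustration.

```python
def summarize_messages(messages):
    """Aggregate media-exposure social media messages of interest.

    Each message is assumed to be a dict with an "author" identifier
    and an "impressions" count.
    """
    return {
        "message_count": len(messages),
        "unique_authors": len({m["author"] for m in messages}),
        "impressions": sum(m["impressions"] for m in messages),
    }


# Hypothetical messages that reference the same media asset.
messages = [
    {"author": "user_a", "impressions": 120},
    {"author": "user_b", "impressions": 30},
    {"author": "user_a", "impressions": 50},
]
summary = summarize_messages(messages)
```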

In the illustrated example of FIG. 1, the audience measurement system(s) 105 send the audience measurement data 110 to the central facility 125 via an example network 115. The example network 115 of the illustrated example of FIG. 1 is the Internet. However, the example network 115 may be implemented using any suitable wired and/or wireless network(s) including, for example, one or more data buses, one or more Local Area Networks (LANs), one or more wireless LANs, one or more cellular networks, one or more private networks, one or more public networks, etc. The example network 115 enables the central facility 125 to be in communication with the audience measurement system(s) 105. As used herein, the phrase “in communication,” including variances thereof, encompasses direct communication and/or indirect communication through one or more intermediary components and does not require direct physical (e.g., wired) communication and/or constant communication, but rather includes selective communication at periodic or aperiodic intervals, as well as one-time events.

In the illustrated example, the central facility 125 is operated by an audience measurement entity (AME) 120 (sometimes referred to as an “audience analytics entity” (AAE)). The example AME 120 of the illustrated example of FIG. 1 is an entity such as The Nielsen Company (US), LLC that monitors and/or reports exposure to media and operates as a neutral third party. That is, in the illustrated example, the audience measurement entity 120 does not provide media (e.g., content and/or advertisements) to end users. This un-involvement with the media production and/or delivery ensures the neutral status of the audience measurement entity 120 and, thus, enhances the trusted nature of the data the AME 120 collects and processes. The reports generated by the audience measurement entity may identify aspects of media usage, such as the number of people who are watching television programs and characteristics of the audiences (e.g., demographic information of who is watching the television programs, when they are watching the television programs, etc.).

The example AME 120 of FIG. 1 operates the central facility 125 to facilitate future projections of a media asset of interest. As used herein, a media asset of interest is a particular media program (e.g., identified via a program identifier such as a title, an alphanumeric code, season and episode numbers, etc.) that is being analyzed (e.g., for a report). In the illustrated example of FIG. 1, the central facility 125 generates one or more reports at the request of an example client 170 (e.g., a television network, an advertiser, etc.). In the illustrated example, the client 170 requests projections for media of interest that will be broadcast in the near-term (e.g., within two quarters from the current quarter) or at a later quarter based on, for example, historical ratings, program characteristics, social media indicators, advertisement spending, programming schedules, etc. In the illustrated example, the client 170 provides the AME 120 an example programming schedule 175 that includes scheduling information for the quarter of interest (e.g., the quarter for which the projections are being generated). In some examples, the programming schedule 175 indicates specific information (e.g., program characteristics) regarding the media asset of interest such as whether the media asset is a series (e.g., a season premiere, a repeat episode, a new episode, etc.), a special (e.g., a one-time event such as a movie, a sporting event, etc.), etc. In some examples, the programming schedule 175 indicates general information, such as a program title and broadcast times of the media. An example upfront programming schedule 200 of the illustrated example of FIG. 2 illustrates an example programming schedule 175 for a quarter of interest that may be provided by the client 170.

In some examples, the client 170 may use the reports provided by the example central facility 125 to analyze exposure to media and take actions accordingly. For example, a television network may increase the cost of an advertising spot (e.g., commercial advertising time either available for sale or purchase from a network) for media associated with relatively greater viewership than other programs, may determine to increase the number of episodes of the media, etc. In some examples, the client 170 (e.g., the television network) may determine whether to discontinue producing a media program associated with relatively lower viewership, reduce the cost of an advertising spot for media that may be projected to have lower ratings, etc. As described above, it is beneficial for a client (e.g., a television network) to accurately project ratings for the media asset since the client may have to pay restitution to an advertiser if the projected ratings are higher than the actual ratings and, thus, the client charged too much for the advertisement spot. Additionally or alternatively, the client may value an advertisement spot too low and, thus, not maximize its gains from the media.

The central facility 125 of the illustrated example includes a server and/or database that collects and/or receives audience measurement data related to media assets (e.g., media and/or media events) and projects future ratings (e.g., near-term ratings or upfront ratings) for the media assets of interest. In some examples, the central facility 125 is implemented using multiple devices and/or the audience measurement system(s) 105 is (are) implemented using multiple devices. For example, the central facility 125 and/or the audience measurement system(s) 105 may include disk arrays and/or multiple workstations (e.g., desktop computers, workstation servers, laptops, etc.) in communication with one another. In the illustrated example, the central facility 125 is in communication with the audience measurement system(s) 105 via one or more wired and/or wireless networks represented by the network 115.

The example central facility 125 of the illustrated example of FIG. 1 processes the audience measurement data 110 returned by the audience measurement system(s) 105 to predict time-shifted exposure to media. For example, the central facility 125 may process the audience measurement data 110 to determine a relationship between predictive features (sometimes referred to herein as “variables,” “predictors” or “factors”) identified from the audience measurement data 110 and measured ratings to build one or more projection models. For example, the central facility 125 may generate a first projection model to project ratings for media that will be broadcast in one or two quarters (e.g., a near-term projection model) and/or may generate a second projection model to project ratings for media that will be broadcast in three or more quarters from the current quarter. The example central facility 125 may then apply data associated with the media asset of interest and the quarter of interest to a projection model to determine a ratings projection for the media asset.
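The choice between the near-term model and the upfront model depends only on how many quarters separate the current quarter from the quarter of interest. A minimal sketch follows, assuming quarters are represented as (year, quarter) tuples; the function names and the model labels "near_term" and "upfront" are hypothetical.

```python
def quarters_ahead(current, target):
    """Number of quarters from current to target, each a (year, quarter) tuple."""
    (current_year, current_quarter), (target_year, target_quarter) = current, target
    return (target_year - current_year) * 4 + (target_quarter - current_quarter)


def select_model(current, target):
    """Pick the projection model for the quarter of interest.

    One or two quarters ahead selects the near-term model; three or
    more quarters ahead selects the upfront model.
    """
    gap = quarters_ahead(current, target)
    if gap < 1:
        raise ValueError("the quarter of interest must be in the future")
    return "near_term" if gap <= 2 else "upfront"


# Q4 2014 to Q1 2015 is one quarter ahead.
model_name = select_model((2014, 4), (2015, 1))
```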

In the illustrated example of FIG. 1, the central facility 125 includes an example data interface 130, an example raw data database 135, an example media mapper 137, an example media catalog 139, an example data transformer 140, an example predictive features data store 145, an example model builder 150, an example models data store 155 and an example future ratings projector 160. In the illustrated example of FIG. 1, the example central facility 125 includes the example data interface 130 to provide an interface between the network 115 and the central facility 125. For example, the data interface 130 may be a wired network interface, a wireless network interface, a Bluetooth® network interface, etc. and may include the associated software and/or libraries needed to facilitate communication between the network 115 and the central facility 125. In the illustrated example of FIG. 1, the data interface 130 receives the audience measurement data 110 returned by the example audience measurement system(s) 105 of FIG. 1. In the illustrated example, the data interface 130 of FIG. 1 also receives the programming schedule 175 provided by the client 170 of FIG. 1. The example data interface 130 records the audience measurement data 110 and the programming schedule 175 in the example raw data database 135.

In the illustrated example of FIG. 1, the example central facility 125 includes the example raw data database 135 to record data (e.g., the example audience measurement data 110, the programming schedule 175, etc.) provided by the audience measurement system(s) 105 and/or the client 170 via the example data interface 130. An example data table 300 of the illustrated example of FIG. 3 illustrates example raw data variables that may be recorded in the example raw data database 135. The example raw data database 135 may be implemented by a volatile memory (e.g., a Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM), etc.) and/or a non-volatile memory (e.g., flash memory). The example raw data database 135 may additionally or alternatively be implemented by one or more double data rate (DDR) memories, such as DDR, DDR2, DDR3, mobile DDR (mDDR), etc. The example raw data database 135 may additionally or alternatively be implemented by one or more mass storage devices such as hard disk drive(s), compact disk drive(s), digital versatile disk drive(s), etc. While in the illustrated example the raw data database 135 is illustrated as a single database, the raw data database 135 may be implemented by any number and/or type(s) of databases.

The example central facility 125 of the illustrated example of FIG. 1 combines multiple disparate data sets to enable modeling and assessment of multiple inputs simultaneously. In the illustrated example of FIG. 1, the central facility 125 includes the example media mapper 137 to identify and/or determine media referencing the same media and/or media that is related. For example, the media mapper 137 may identify a reference to a program in the panelist media measurement data 110A by a first name (e.g., “How To Run A Steakhouse”), and may identify a social media message in the social media activity data 110B referencing the same program by a second name (e.g., “#HTRAB”). In such instances, the example media mapper 137 maps the first name to the second name. In some examples, the media mapper 137 may identify a third name included in the audience measurement data 110 that includes a typographical error in the program name (e.g., “How Too Run A Steakhouse”). In such instances, the example media mapper 137 maps the first name, the second name and the third name to the same program via, for example, a media identifier (e.g., “01234”).

In some examples, the media mapper 137 may determine that a first program and a second program are not referencing the same program, but are related to each other. For example, the second program may be a spin-off of the first program. The example media mapper 137 records the media mappings in the example media catalog 139. The example media mapper 137 uses title names to identify and/or determine media referencing the same media and/or media that is related. However, any other technique of mapping related media may additionally or alternatively be used. For example, the media mapper 137 may parse the raw data database 135 and identify related media based on broadcast day and times (e.g., Tuesday, 8:00 pm), media director(s), character name(s), actor and actress name(s), etc.
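One way a title-based mapping could resolve alternate names and misspellings to a single media identifier is sketched below. The disclosure does not specify a matching technique, so the approximate string matching via Python's difflib, the 0.9 similarity cutoff, and the class name MediaCatalog are all assumptions; the titles, hashtag and identifier are the examples given above.

```python
import difflib


def normalize(title):
    """Lower-case a title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


class MediaCatalog:
    """Hypothetical catalog that maps known title variants to media identifiers."""

    def __init__(self):
        self._ids = {}

    def register(self, title, media_id):
        """Record that a title variant refers to the given media identifier."""
        self._ids[normalize(title)] = media_id

    def lookup(self, title, cutoff=0.9):
        """Return the media identifier for a title, tolerating small typos."""
        key = normalize(title)
        if key in self._ids:
            return self._ids[key]
        close = difflib.get_close_matches(key, list(self._ids), n=1, cutoff=cutoff)
        return self._ids[close[0]] if close else None


catalog = MediaCatalog()
catalog.register("How To Run A Steakhouse", "01234")  # first name
catalog.register("#HTRAB", "01234")                   # second name (hashtag)
typo_id = catalog.lookup("How Too Run A Steakhouse")  # third name, with a typo
```

A hashtag such as the second name cannot be recovered by string similarity alone, which is why it is registered explicitly in this sketch.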

In the illustrated example of FIG. 1, the example central facility 125 includes the example media catalog 139 to record mappings provided by the example media mapper 137. The example media catalog 139 may be implemented by a volatile memory (e.g., an SDRAM, DRAM, RDRAM, etc.) and/or a non-volatile memory (e.g., flash memory). The example media catalog 139 may additionally or alternatively be implemented by one or more DDR memories, such as DDR, DDR2, DDR3, mDDR, etc. The example media catalog 139 may additionally or alternatively be implemented by one or more mass storage devices such as hard disk drive(s), compact disk drive(s), digital versatile disk drive(s), etc. While in the illustrated example the media catalog 139 is illustrated as a single database, the media catalog 139 may be implemented by any number and/or type(s) of databases.

As described above, at least some of the variables are transformed (e.g., modified and/or manipulated) from their raw form in the raw data database 135 to be more meaningfully handled when building the projection models and projecting ratings for future broadcast(s) of media. For example, raw data may be multiplied, aggregated, averaged, etc., and stored as predictive features (sometimes referred to herein as “transformed,” “sanitized,” “engineered,” “normalized” or “recoded” data) prior to generating the projection models used to project the ratings for the media of interest.

In the illustrated example of FIG. 1, the example central facility 125 includes the example data transformer 140 to translate the audience measurement data 110 received from the example audience measurement system(s) 105 into a form more meaningfully handled by the example model builder 150 (e.g., into predictive features). For example, the data transformer 140 of FIG. 1 may retrieve and/or query the audience measurement data 110 recorded in the example raw data database 135 and normalize the disparate data to a common scale. In the illustrated example, the example data transformer 140 modifies and/or manipulates audience measurement data 110 based on the type of data. For example, the data transformer 140 may translate (e.g., map) data that is a string data type (e.g., “Day-of-Week” is “Tuesday”) to a Boolean data type (e.g., “Day Tues” is set to true (e.g., “1”)).
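The string-to-Boolean translation described above can be sketched as a one-indicator-per-day encoding. Only the “Day Tues” feature name comes from the example above; the other feature names, the abbreviation table and the function name encode_day_of_week are assumptions for illustration.

```python
# Assumed abbreviations; only "Tues" appears in the example above.
DAY_ABBREVIATIONS = {
    "Monday": "Mon",
    "Tuesday": "Tues",
    "Wednesday": "Wed",
    "Thursday": "Thurs",
    "Friday": "Fri",
    "Saturday": "Sat",
    "Sunday": "Sun",
}


def encode_day_of_week(day):
    """Translate a day-of-week string into seven 0/1 indicator features."""
    selected = DAY_ABBREVIATIONS[day]
    return {
        f"Day {abbreviation}": int(abbreviation == selected)
        for abbreviation in DAY_ABBREVIATIONS.values()
    }


# "Day-of-Week" is "Tuesday" -> "Day Tues" is 1, all other indicators are 0.
features = encode_day_of_week("Tuesday")
```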

As described above and in connection with the example data table 300 of FIG. 3, the audience measurement data 110 may be in different data formats and/or different units of measure. For example, program characteristic information, such as program title, episode and season identifying information, day of week, broadcast time, broadcast quarter, broadcast network and genre may be stored as string data types. Current and historical ratings information may be represented via television rating scores (e.g., floating data types). Social media indicators (e.g., message identifiers, message timestamps, message content, message author identifiers, message impression information, etc.) may be represented as string data types. In the illustrated example of FIG. 1, the data transformer 140 normalizes the audience measurement data 110 into numerical data types (e.g., Boolean data types, integer data types and/or floating data types). The example data transformer 140 of FIG. 1 records transformed data in the example predictive features data store 145.

In the illustrated example of FIG. 1, the example central facility 125 includes the example predictive features data store 145 to record transformed data provided by the example data transformer 140. Example data tables 500, 600, 700, 800 and 900 of the illustrated examples of FIGS. 5, 6, 7, 8 and 9, respectively, illustrate example translated data variables that may be recorded in the example predictive features data store 145. The example predictive features data store 145 may be implemented by a volatile memory (e.g., an SDRAM, DRAM, RDRAM, etc.) and/or a non-volatile memory (e.g., flash memory). The example predictive features data store 145 may additionally or alternatively be implemented by one or more DDR memories, such as DDR, DDR2, DDR3, mDDR, etc. The example predictive features data store 145 may additionally or alternatively be implemented by one or more mass storage devices such as hard disk drive(s), compact disk drive(s), digital versatile disk drive(s), etc. While in the illustrated example the predictive features data store 145 is illustrated as a single database, the predictive features data store 145 may be implemented by any number and/or type(s) of databases.

In the illustrated example of FIG. 1, the central facility 125 includes the example model builder 150 to build one or more projection model(s) that may be used to project ratings for future broadcast(s) of media (e.g., near-term projections, upfront projections, etc.). In the illustrated example, the model builder 150 determines a relationship between one or more predictive features retrieved from the example predictive features data store 145 and historical ratings to build one or more projection model(s).

In the illustrated example of FIG. 1, the model builder 150 utilizes a Stochastic Gradient Boosting Machine (GBM) to generate the projection models. GBM is a family of machine-learning techniques for regression problems. In the illustrated example, the model builder 150 produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. By utilizing GBM, the example model builder 150 is able to model complex relationships, including when using non-uniform data sources and/or missing information.
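The ensemble idea can be illustrated with a minimal Python sketch of stochastic gradient boosting for regression. This is not the disclosed implementation: it assumes squared-error loss, uses one-split decision trees ("stumps") as the weak prediction models, and does not handle missing values. All function names and parameters (`fit_gbm`, `predict_gbm`, `n_trees`, `learning_rate`, `subsample`, `seed`) are hypothetical.

```python
import random


def fit_stump(rows, residuals):
    """Fit a one-split regression tree (a weak learner) to the residuals."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [v for r, v in zip(rows, residuals) if r[j] <= t]
            right = [v for r, v in zip(rows, residuals) if r[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    if best is None:  # no valid split exists: fall back to a constant
        m = sum(residuals) / len(residuals)
        return (0, float("inf"), m, m)
    return best[1:]


def stump_predict(stump, row):
    j, t, left_mean, right_mean = stump
    return left_mean if row[j] <= t else right_mean


def fit_gbm(rows, targets, n_trees=100, learning_rate=0.1, subsample=0.8, seed=0):
    """Build an ensemble; each stump is fit to residuals on a random subsample."""
    rng = random.Random(seed)
    base = sum(targets) / len(targets)
    preds = [base] * len(rows)
    stumps = []
    k = min(len(rows), max(2, int(subsample * len(rows))))
    for _ in range(n_trees):
        idx = rng.sample(range(len(rows)), k)  # the "stochastic" part
        stump = fit_stump([rows[i] for i in idx],
                          [targets[i] - preds[i] for i in idx])
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, r)
                 for p, r in zip(preds, rows)]
    return base, learning_rate, stumps


def predict_gbm(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)
```

In practice a maintained library implementation would normally be used instead of hand-written code; the sketch only shows how many weak models, each fit to the errors left by the models before it, combine into one projection.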

In the illustrated example of FIG. 1, the model builder 150 applies historical values of one or more predictive features from the predictive features data store 145 to train the model using GBM. However, any other technique may additionally or alternatively be used to train a model. For example, the model builder 150 may utilize an equation representative of a projection model that may be built by the example model builder 150. In some such instances, the model builder 150 may apply historical values of one or more predictive features (X_(i)) from the predictive features data store 145 to the representative equation to train the model to determine values of coefficients (a_(i)) that modify the predictive features (X_(i)).

In the illustrated example of FIG. 1, the example model builder 150 builds different models to project ratings for future broadcast(s) of media based on the quarter of interest and/or future programming information available. For example, the model builder 150 may apply different sets of predictive features to the GBM to determine the coefficient values (a_(i)) of the predictive features (X_(i)) based on the quarter of interest (e.g., one quarter in the future, three quarters in the future, etc.).

In the illustrated example of FIG. 1, the example model builder 150 selects the predictive features (X_(i)) to apply to GBM based on the quarter of interest and attributes of the media assets included in the corresponding programming schedule 175. For example, the model builder 150 may build a first projection model by applying all available historical information (e.g., historical ratings for media broadcast at the same time and day of week, historical ratings for related media, etc.), social media indicators (e.g., number of social media messages posted referencing media of interest, number of unique authors posting social media messages referencing media of interest, etc.), etc. to the GBM. The example model builder 150 may build a second projection model by applying a subset of the historical information available to the GBM. For example, the model builder 150 may exclude historical information equivalent to the gap of interest (e.g., the number of quarters between the current quarter and the quarter of interest).

In some such instances, while the general technique of GBM is used to build projection models, the predictive features included in the corresponding models are different and, as a result, the ensembles of prediction models differ between the two projection models. While the illustrated example associates near-term projection models with one or two quarters in the future and associates the upfront projection models with three or more quarters in the future, other time periods (e.g., “gaps”) may additionally or alternatively be used. The example model builder 150 of FIG. 1 stores the generated projection models in the example models data store 155.
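The gap-based choices described above can be sketched in Python as follows. The sketch assumes quarters are given as (year, quarter) pairs, and it reads "excluding historical information equivalent to the gap" as dropping the most recent quarters of history, up to the size of the gap; that reading, and every name below, is an assumption made for illustration.

```python
def quarter_index(year, quarter):
    """Map (year, quarter) to a single integer so quarters can be subtracted."""
    return year * 4 + (quarter - 1)


def gap_in_quarters(current, target):
    """Number of quarters between the current quarter and the quarter of interest."""
    return quarter_index(*target) - quarter_index(*current)


def model_kind(gap):
    """Near-term models cover one or two quarters ahead; upfront models three or more."""
    if gap < 1:
        raise ValueError("the quarter of interest must be in the future")
    return "near_term" if gap <= 2 else "upfront"


def usable_history(history, current, gap):
    """Keep only history at least `gap` quarters older than the current quarter.

    `history` maps (year, quarter) to a rating value.
    """
    cutoff = quarter_index(*current) - gap
    return {q: r for q, r in history.items() if quarter_index(*q) <= cutoff}
```

For example, with a current quarter of (2015, 1) and a quarter of interest of (2015, 4), the gap is three quarters, so an upfront model would be selected and the three most recent quarters of history would be left out of training.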

In the illustrated example of FIG. 1, the example central facility 125 includes the example models data store 155 to store projection models generated by the example model builder 150. The example models data store 155 may be implemented by a volatile memory (e.g., SDRAM, DRAM, RDRAM, etc.) and/or a non-volatile memory (e.g., flash memory). The example models data store 155 may additionally or alternatively be implemented by one or more DDR memories, such as DDR, DDR2, DDR3, mDDR, etc. The example models data store 155 may additionally or alternatively be implemented by one or more mass storage devices such as hard disk drive(s), compact disk drive(s), digital versatile disk drive(s), etc. While in the illustrated example the models data store 155 is illustrated as a single database, the models data store 155 may be implemented by any number and/or type(s) of databases.

In the illustrated example of FIG. 1, the central facility 125 includes the example future ratings projector 160 to use the projection models generated by the example model builder 150 to project ratings for future broadcasts of media. For example, the future ratings projector 160 may apply data related to a media asset of interest to predict viewership of the media asset of interest in three quarters from the current quarter. In the illustrated example, the future ratings projector 160 uses program characteristics of the media asset of interest and the quarter of interest to select a projection model to apply. For example, the future ratings projector 160 may determine a projection model based on the gap of interest and the amount of future information available for the media asset of interest.

In the illustrated example of FIG. 1, in response to selecting the projection model to apply, the example future ratings projector 160 retrieves data related to the media asset of interest from the predictive features data store 145.

The example future ratings projector 160 of the illustrated example of FIG. 1 applies the data related to a media asset of interest to the generated projection models stored in the example models data store 155 to generate reports 165 predicting the ratings for a future broadcast of the media asset of interest. For example, the future ratings projector 160 may estimate the ratings for a media asset of interest by applying program attributes information, social media indicators information and/or media performance information to a projection model. As used herein, program attributes information includes genre information of the media asset of interest, media type information (e.g., a series, a special, a repeat, a premiere, a new episode, etc.), day-of-week information related to the media asset of interest, broadcast time related to the media asset of interest, originator (e.g., network or channel) information related to the media asset of interest, etc. As used herein, social media indicators information includes a social media messages count related to the number of media-exposure social media messages of interest, a social media unique authors count related to the number of unique authors who posted media-exposure social media messages of interest, a social media impressions count related to the number of users who were exposed to the media-exposure social media messages of interest, etc. As used herein, media performance information includes ratings associated with the media asset of interest (e.g., historical ratings associated with the media asset of interest), a day and time of broadcast (e.g., Tuesdays at 8:00 pm), etc.
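The three social media indicator counts named above can be derived from a list of messages, as in the following Python sketch. The message layout (dictionaries with hypothetical `author` and `impressions` keys) is an assumption, and summing per-message impressions is only an approximation of the number of users exposed, since a single user may see several messages.

```python
def social_media_indicators(messages):
    """Summarize media-exposure social media messages into three counts."""
    return {
        # Number of messages of interest.
        "messages_count": len(messages),
        # Number of distinct authors who posted those messages.
        "unique_authors_count": len({m["author"] for m in messages}),
        # Total impressions across messages (missing values count as zero).
        "impressions_count": sum(m.get("impressions", 0) for m in messages),
    }
```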

FIG. 2 is a portion of an example upfront programming schedule 200 that may be used by the example central facility 125 of FIG. 1 to forecast ratings for media broadcast during a corresponding future quarter. In the illustrated example, the upfront programming schedule 200 is provided by the client 170 when requesting the future broadcast rating projections. The example upfront programming schedule 200 includes day of week and broadcast times of different media including primetime media 205 and daytime media 210. The example upfront programming schedule 200 also includes scheduled broadcast times of special media 215.

In the illustrated example of FIG. 2, the primetime media 205 is associated with television series that run on a repeating basis. For example, a sitcom that airs on a weekly basis is a series. In the illustrated example, a series episode may be a premiere episode (e.g., a first episode of a season), a new episode (e.g., a first time that the particular episode is broadcast) or a repeat episode. As described below, projecting the ratings for a broadcast of a series in a future quarter is advantageous because additional information is known. For example, historical series performance information may be utilized when forming the projections. In addition, future programming information is known about the series. For example, a series that is a comedy will tend to still be a comedy in a future quarter.

In the illustrated example of FIG. 2, the daytime media 210 is associated with little or no future programming information availability. As described below, when media is classified as daytime media or no future programming information is available for the media asset, then the example central facility 125 of FIG. 1 utilizes historical program characteristics for particular days of the week and broadcast times when projecting future ratings.

In the illustrated example of FIG. 2, the special media 215 is associated with one-time events such as movies, sporting events, marathons (e.g., ten back-to-back episodes of a series, etc.), etc. Similar to daytime media, special media 215 does not have past series historical performance information. For example, in the illustrated example of FIG. 2, the special “Life After It Exploded” is a movie that will be broadcast two times in the fourth quarter of 2015 (e.g., at 22:00 and then at 01:00 on Oct. 13, 2015). In such instances, past historical ratings for the media asset (e.g., the special “Life After It Exploded”) are not available and/or are not reliable predictors for future ratings projections. However, in the illustrated example, the central facility 125 utilizes program characteristics such as genre and whether the special media 215 is a movie, a special, etc., to project future ratings.
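The three cases described for FIG. 2 can be summarized as a lookup from a media asset classification to the kinds of information available for a projection. The following Python sketch is illustrative only; the function name, the classification labels and the feature-group names are hypothetical and are not taken from the disclosure.

```python
def available_inputs(classification, has_future_info=True):
    """Return the kinds of information usable when projecting ratings."""
    if classification == "daytime" or not has_future_info:
        # Little or no future programming information: rely on historical
        # characteristics for the day of week and broadcast time.
        return ["day_time_historical_ratings"]
    if classification == "series":
        # Series: historical series performance plus known program attributes.
        return ["series_historical_ratings", "program_attributes"]
    if classification == "special":
        # Specials: no series history, so use genre and media type.
        return ["genre", "media_type"]
    raise ValueError(f"unknown classification: {classification!r}")
```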

FIG. 3 is an example data table 300 that lists raw audience measurement data variables that the example data interface 130 of FIG. 1 may store in the example raw data database 135 of FIG. 1. In the illustrated example of FIG. 3, the raw audience measurement data variables represent the data collected and/or provided by the audience measurement system(s) 105 of FIG. 1 and/or the client 170 of FIG. 1. For example, the raw audience measurement data variables may include the panelist media measurement data 110A collected via, for example, people meters operating in statistically-selected households, set-top boxes and/or other media devices (e.g., such as digital video recorders, personal computers, tablet computers, smartphones, etc.) capable of monitoring and returning monitored data for media presentations, etc. The example raw audience measurement data variables included in the data table 300 may also include the social media activity data 110B associated with media of interest referenced by social media messages collected via, for example, social media servers that provide social media services to users of the social media server. In some examples, the raw audience measurement data may also include the programming schedule 175 and additional client-provided data, such as the amount of money and/or resources the client anticipates spending on promoting the media assets for the quarter.

The example data table 300 of the illustrated example of FIG. 3 includes a variable name identifier column 305, a variable data type identifier column 310 and a variable meaning identifier column 315. The example variable name identifier column 305 indicates example variables that may be associated with a telecast and/or may be useful for projecting media ratings. The example variable data type identifier column 310 indicates a data type of the corresponding variable. The example variable meaning identifier column 315 provides a brief description of the value associated with the corresponding variable. While three example variable identifier columns are represented in the example data table 300 of FIG. 3, more or fewer variable identifier columns may be represented in the example data table 300. For example, the example data table 300 may additionally or alternatively include a variable identifier column indicative of the source of the corresponding data (e.g., the example panelist media measurement data 110A, the example social media activity data 110B, the client 170, etc.).

The example data table 300 of the illustrated example of FIG. 3 includes sixteen example rows corresponding to example raw audience measurement data variables. The example first block of rows 350 identifies attributes and/or characteristics of a media asset, which are stored as strings. For example, the “Title” variable identifies the name of the media asset (e.g., “Sports Stuff”), the “Type Identifier” variable identifies the media type of the media asset (e.g., a “series,” a “movie,” etc.), the “Day of Week” variable identifies the day of the week that the media asset was broadcast (e.g., “Tuesday”), the “Broadcast Time” variable identifies the time during which the media asset was broadcast (e.g., “20:00-20:30”), the “Network” variable identifies on which network the media asset was broadcast (e.g., Channel “ABC”), and the “Genre” variable identifies the genre under which the media asset is classified (e.g., a “comedy”).

In the example data table 300 of FIG. 3, the second example block of rows 355 identifies ratings information associated with a media asset and the corresponding information is stored as floating type data. For example, the “Media Ratings” variable identifies the program ratings associated with the broadcast of the program (e.g., “1.01”). In the illustrated example, the ratings correspond to the viewership during the original broadcast of the program and also include time-shifted incremental viewing that takes place via, for example, a DVR or video-on-demand (VOD) service during the following 7 days (e.g., “live+7” ratings). In some examples, the data table 300 includes ratings information for specific time and days of the week. For example, telecast-level ratings measure viewership at, for example, one-minute periods. In such instances, the “DayTime ratings” variable represents the number of people who were tuned to a particular channel at a particular minute. For example, a first “DayTime ratings” value may represent the number of people who were watching channel “ABC” between “20:00 and 20:01” on “Tuesday,” and a second “DayTime ratings” value may represent the number of people who were watching channel “ABC” between “20:01 and 20:02” on “Tuesday.” Although the example data table 300 includes “live+7” ratings, other ratings may additionally or alternatively be used. For example, the ratings information in the data table 300 may include “live” ratings, “live+same day” ratings (e.g., ratings that represent the number of people who viewed the media asset during its original broadcast time and/or during the same day as the original broadcast), “C3” ratings (e.g., ratings (sometimes presented as a percentage) that represent the number of people who viewed a commercial spot during its original broadcast time and/or within the following three days of the original broadcast), “C3” impressions, etc.

In the illustrated example of FIG. 3, the example row 365 indicates the “Panelist ID” variable is stored as a string and uniquely identifies the panelist who provided the viewership information. For example, panelists who are provided people meters may be assigned a panelist identifier to monitor the media exposure of the panelist. In the illustrated example, the panelist identifier (ID) is an obfuscated alphanumeric string to protect the identity of the panelist. In some examples, the panelist identifier is obfuscated in a manner so that the same obfuscated panelist identifier information corresponds to the same panelist. In this manner, user activities may be monitored for particular users without exposing sensitive information regarding the panelist. However, any other approach to protecting the privacy of a panelist may additionally or alternatively be used. In some examples, the panelist identifier is used to identify demographic information associated with the panelist. For example, the panelist identifier “0123” may link to demographic information indicating the panelist is a male, age 19-49.
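One way to obtain an obfuscated identifier that always maps the same panelist to the same string is a keyed hash. The following Python sketch uses HMAC-SHA256 from the standard library; the choice of HMAC, the 16-character truncation, and the names `obfuscate_panelist_id` and `secret_key` are assumptions for illustration, since the disclosure does not specify an obfuscation technique.

```python
import hashlib
import hmac


def obfuscate_panelist_id(raw_id, secret_key):
    """Return a stable alphanumeric token for a raw panelist identifier.

    The same (raw_id, secret_key) pair always yields the same token, so
    activity can be linked to a panelist without storing the raw identifier.
    """
    digest = hmac.new(secret_key, raw_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated hex string (assumed length)
```

Keeping the key secret matters in this design: without it, an unkeyed hash of a short identifier could be reversed by trying every possible identifier.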

In the example data table 300 of FIG. 3, the third example block of rows 370 identifies information regarding social media messages. For example, the “Message ID” variable is stored as a floating data type and is a unique identifier of a social media message. In the illustrated example, the example “Message Timestamp” variable is stored as a string data type and identifies the date and/or time when the corresponding social media message was posted. In the illustrated example, the example “Message Content” variable is stored as a string data type and identifies the content of the corresponding social media message. In the illustrated example, the example “Message Author” variable is stored as a string data type and identifies the author of the corresponding social media message.

In the example data table 300 of FIG. 3, the fourth example block of rows 375 represents different vehicles of advertising and the amount of money and/or resources that are allocated to advertising the media asset via the corresponding vehicle. In the illustrated example, the advertising spending amounts are stored as floating values. Although the example data table 300 includes three different vehicles for advertisement spending, any other number of advertising vehicles and/or vehicle types may additionally or alternatively be used. Furthermore, in some instances, the advertisement spending variable may not be granular (e.g., not indicating separate vehicles), but rather represent a total amount that the client 170 anticipates spending in advertising for the media asset.

While sixteen example raw data variables are represented in the example data table 300 of FIG. 3, more or fewer raw data variables may be represented in the example data table 300 corresponding to the many raw audience measurement data variables that may be collected and/or provided by the audience measurement system(s) 105 of FIG. 1 and/or the client 170 of FIG. 1.

FIG. 4 is a block diagram of an example implementation of the data transformer 140 of FIG. 1 that may facilitate manipulating and/or modifying raw audience measurement data 110 retrieved from the example raw data database 135. As described above, the example data transformer 140 transforms the raw information stored in the example raw data database 135 to a form that may be meaningfully handled by the example model builder 150 to generate one or more projection model(s). The example data transformer 140 of FIG. 4 includes an example ratings handler 405, an example attributes handler 410, an example social media handler 415, an example spending handler 420 and an example universe handler 425. In the illustrated example, the ratings handler 405, the attributes handler 410, the social media handler 415, the spending handler 420 and the universe handler 425 record the transformed information in the example predictive features data store 145 of FIG. 1.

In the illustrated example of FIG. 4, the example data transformer 140 includes the example ratings handler 405 to process ratings-related information representative of media assets. For example, the ratings handler 405 may query and/or retrieve ratings-related information from the raw data database 135 (e.g., current ratings information, historical ratings information, etc.) and transform the retrieved ratings-related information into a form meaningfully handled by the example model builder 150 and/or the example future ratings projector 160.

An example data table 500 of the illustrated example of FIG. 5 illustrates example ratings predictive features that may be recorded by the ratings handler 405 in the example predictive features data store 145. The example data table 500 of the illustrated example of FIG. 5 includes a feature name identifier column 505, a feature data type identifier column 510 and a feature meaning identifier column 515. The example feature name identifier column 505 indicates example predictive features that may be associated with a media asset broadcast and/or useful for projecting ratings for future broadcasts of the media asset. The example feature data type identifier column 510 indicates a data type of the corresponding predictive feature. The example feature meaning identifier column 515 provides a brief description of the value associated with the corresponding predictive feature. While three example feature identifier columns are represented in the example data table 500 of FIG. 5, more or fewer feature identifier columns may be represented in the example data table 500.

The example data table 500 of the illustrated example of FIG. 5 includes five example rows corresponding to example ratings-related predictive features. The example first row 550 indicates the ratings handler 405 of FIG. 4 stores the “Hour Ratings” feature as a floating data type. In the illustrated example, the ratings handler 405 determines an “Hour Rating” value based on the “DayTime Ratings” variable retrieved from the example raw data database 135. For example, the ratings handler 405 may query the raw data database 135 for the “DayTime Rating” values starting at a time (e.g., “20:00”) and for a day of the week (e.g., “Tuesday”). In the illustrated example, the ratings handler 405 calculates a rating for the corresponding day and time and records the logarithm transformation of the calculated rating as the “Hour Rating” for an hour-long period starting at the time and day of the week in the example predictive features data store 145.

In the illustrated example, the second example row 555 indicates the ratings handler 405 determines a “Series Ratings” value associated with a media asset of interest based on the “Media Ratings” variable retrieved from the example raw data database 135. For example, the ratings handler 405 may query the raw data database 135 for the “Media Ratings” rating related to a media asset of interest (e.g., “Sports Stuff”). In some examples, the media asset of interest is media included in, for example, the example programming schedule 175 of FIG. 1. In some examples, the media asset of interest may be media identified by the media mapper 137 as related media. In the illustrated example, the ratings handler 405 calculates an average rating for the media asset of interest based on the historical ratings for the program and records the logarithm transformation of the average rating as the “Series Ratings” of the media asset of interest in the example predictive features data store 145. However, other techniques for calculating historical ratings for a media asset (e.g., a series) may additionally or alternatively be used. For example, the ratings handler 405 may calculate average ratings for the media asset on an episode-by-episode basis. For example, the ratings handler 405 may retrieve all historical ratings for the second episode of Quarter 2 and calculate a media ratings value for the second episode of the media asset.

In the illustrated example, the third example row 560 indicates the ratings handler 405 determines a “Genre Rating” value based on the “Media Ratings” variable and the “Genre” variable retrieved from the example raw data database 135. For example, the ratings handler 405 may use the “Genre” variable to query the raw data database 135 for the “Media Rating” values for media assets classified by the genre. In the illustrated example, the ratings handler 405 calculates an average rating for the genre and records the logarithm transformation of the calculated average as the “Genre Rating” in the example predictive features data store 145.

While the example data table 500 of FIG. 5 includes three example historical ratings features, any other number of historical ratings may additionally or alternatively be used.
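The three historical ratings features share one pattern: select the relevant raw ratings, average them, and record the logarithm of the average. The following Python sketch illustrates that pattern under several assumptions that the disclosure does not state: the natural logarithm is used, the hour rating is the mean of the minute-level values in the hour, minute-level ratings are keyed by (day, minute of day), and telecasts are dictionaries with hypothetical `title`, `genre` and `rating` keys.

```python
import math
from statistics import mean


def log_average(ratings):
    """Average the ratings, then apply the natural logarithm."""
    average = mean(ratings)  # raises StatisticsError if no ratings matched
    if average <= 0:
        raise ValueError("logarithm is undefined for a non-positive rating")
    return math.log(average)


def hour_rating(day_time_ratings, day, start_minute):
    """Log of the mean minute-level rating over the hour starting at start_minute."""
    window = [value for (d, minute), value in day_time_ratings.items()
              if d == day and start_minute <= minute < start_minute + 60]
    return log_average(window)


def series_rating(telecasts, title):
    """Log of the mean historical rating for one media asset."""
    return log_average([t["rating"] for t in telecasts if t["title"] == title])


def genre_rating(telecasts, genre):
    """Log of the mean historical rating across media assets of one genre."""
    return log_average([t["rating"] for t in telecasts if t["genre"] == genre])
```

A logarithm compresses the long right tail that ratings tend to have, which is one common reason for applying it before model training.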

In the illustrated example of FIG. 4, the example data transformer 140 includes the example attributes handler 410 to process attributes and/or characteristics representative of media assets. For example, the attributes handler 410 may query and/or retrieve program attributes information from the raw data database 135 (e.g., genre-identifying information, day-of-week information, broadcast time-identifying information, etc.) and transform the retrieved program attributes information into a form meaningfully handled by the example model builder 150 and/or the example future ratings projector 160.

An example data table 600 of the illustrated example of FIG. 6 illustrates example program attributes predictive features that may be recorded by the attributes handler 410 in the example predictive features data store 145. The example data table 600 of the illustrated example of FIG. 6 includes a feature name identifier column 605, a feature data type identifier column 610 and a feature meaning identifier column 615. The example feature name identifier column 605 indicates example predictive features that may be associated with a media asset and/or useful for projecting ratings for future broadcasts of the media asset. The example feature data type identifier column 610 indicates a data type of the corresponding predictive feature. The example feature meaning identifier column 615 provides a brief description of the value associated with the corresponding predictive feature. While three example feature identifier columns are represented in the example data table 600 of FIG. 6, more or fewer feature identifier columns may be represented in the example data table 600.

The example data table 600 of the illustrated example of FIG. 6 includes fourteen example rows corresponding to example transformed program attributes predictive features. In the illustrated example, the example program attributes predictive features of the data table 600 represent six example characteristics of a media asset. The first example block of rows 650 indicates that the example attributes handler 410 stores day-of-week information as Boolean features. In the illustrated example, the attributes handler 410 translates day-of-week information that is stored as a string data type at the raw data database 135 to one or more day-of-week Boolean features. For example, the attributes handler 410 may retrieve day-of-week information related to a media asset indicating the day of the week that the media asset is broadcast (e.g., “Tuesday”) and set the corresponding day-of-week Boolean feature to true (e.g., “1”) and set (or reset) other day-of-week Boolean features to false (e.g., “0”). In the illustrated example, in response to determining that the raw day-of-week information indicates the media asset is broadcast on a “Tuesday,” the example attributes handler 410 sets the value of the corresponding “Day Tues” feature to true (e.g., “1”) and sets (or resets) the values of the other day-of-week Boolean features (e.g., “Day Mon,” . . . “Day SatSun”) to false (e.g., “0”). Although the example day-of-week information is represented as six example Boolean features in the example data table 600 of FIG. 6, any other number of Boolean features may additionally or alternatively be used. For example, the attributes handler 410 may group the days-of-week information into a weekday Boolean feature (e.g., the day-of-week is “Monday,” “Tuesday,” “Wednesday,” “Thursday” or “Friday”) or a weekend (e.g., the day-of-week is “Saturday” or “Sunday”) Boolean feature.
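The day-of-week translation can be sketched in Python as follows. The feature names “Day Mon,” “Day Tues” and “Day SatSun” come from the description above; the remaining names (“Day Wed,” “Day Thurs,” “Day Fri”) and the function name are assumed for illustration.

```python
# Raw day-of-week string -> Boolean feature name. Saturday and Sunday share
# one feature, which gives six features in total.
DAY_FEATURES = {
    "Monday": "Day Mon",
    "Tuesday": "Day Tues",
    "Wednesday": "Day Wed",      # assumed name
    "Thursday": "Day Thurs",     # assumed name
    "Friday": "Day Fri",         # assumed name
    "Saturday": "Day SatSun",
    "Sunday": "Day SatSun",
}


def day_of_week_features(day):
    """Set the matching day-of-week feature to 1 and all others to 0."""
    if day not in DAY_FEATURES:
        raise ValueError(f"unknown day of week: {day!r}")
    features = {name: 0 for name in DAY_FEATURES.values()}
    features[DAY_FEATURES[day]] = 1
    return features
```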

The second example block of rows 655 of the data table 600 of FIG. 6 indicates that the example attributes handler 410 stores genre-identifying information as Boolean features. In the illustrated example, the attributes handler 410 translates genre-identifying information that is stored as a string data type at the raw data database 135 to one or more genre-related Boolean features. For example, the attributes handler 410 may retrieve genre-identifying information indicative of the genre classification of a media asset (e.g., a documentary, drama, variety, comedy, etc.) and set the corresponding genre-related Boolean feature to true (e.g., “1”) and set (or reset) other genre-related Boolean features to false (e.g., “0”). In the illustrated example, in response to determining that retrieved raw genre-identifying information indicates the corresponding media asset is a “comedy,” the example attributes handler 410 sets the value of the corresponding “Genre Comedy” feature to true (e.g., “1”) and sets (or resets) the values of the other genre-related Boolean features (e.g., “Genre Documentary,” “Genre Drama” and “Genre Variety”) to false (e.g., “0”). Although the example genre-identifying information is represented as three example genre-related Boolean features in the example data table 600 of FIG. 6, any other number of Boolean features representative of the genre of a media asset may additionally or alternatively be used.

The third example block of rows 660 of the data table 600 of FIG. 6 indicates that the example attributes handler 410 stores originator-identifying information as Boolean features. In the illustrated example, the attributes handler 410 translates originator-identifying information that is stored as a string data type at the raw data database 135 to one or more originator-related Boolean features. For example, the attributes handler 410 may retrieve originator-identifying information indicative of the network (or channel) that broadcasts a media asset (e.g., channel “ABC,” channel “XYZ,” etc.) and set the corresponding originator-related Boolean feature to true (e.g., “1”) and set (or reset) other originator-related Boolean features to false (e.g., “0”). In the illustrated example, in response to determining that retrieved raw originator-identifying information indicates the corresponding media asset is broadcast on channel “ABC,” the example attributes handler 410 sets the value of the corresponding “Originator ABC” feature to true (e.g., “1”) and sets (or resets) the values of the other originator-related Boolean features (e.g., “Originator XYZ”) to false (e.g., “0”). While two example originators are represented in the example data table 600 of FIG. 6, more or fewer originators may be represented in the example data table 600 corresponding to the many broadcast networks and cable networks that broadcast media assets.

The example row 665 of the data table 600 of FIG. 6 indicates that the example attributes handler 410 stores broadcast time-identifying information as an integer data type. In the illustrated example, the attributes handler 410 maps broadcast time-identifying information that is stored as a string data type at the raw data database 135 to an integer. For example, the attributes handler 410 may retrieve broadcast time-identifying information indicative of when a media asset is broadcast (e.g., “00:00-01:00,” “03:00-04:00,” . . . “23:00-00:00”) and set the “Hour Block” feature value based on a corresponding hour block. For example, the attributes handler 410 may map the broadcast time “00:00-01:00” to hour block “0,” may map the broadcast time “01:00-02:00” to hour block “1,” etc. Although the example broadcast time-identifying information is represented as hour blocks, any other granularity may additionally or alternatively be used. For example, the broadcast times may be based on quarter-hours, half-hours, etc.
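As an illustration of the mapping from a broadcast-time string to an integer “Hour Block” value, the following sketch assumes the raw string always has the form “HH:MM-HH:MM” and that the block number is simply the starting hour. The function name `hour_block` is hypothetical.

```python
def hour_block(broadcast_time: str) -> int:
    """Map a broadcast-time string such as "03:00-04:00" to its hour block (0-23)."""
    start, _end = broadcast_time.split("-")
    hours, _minutes = start.split(":")
    block = int(hours)
    # Assumption: anything outside a 24-hour clock is treated as bad input.
    if not 0 <= block <= 23:
        raise ValueError(f"start hour out of range: {broadcast_time!r}")
    return block
```

A finer granularity (e.g., half-hours) could be supported by also using the minutes field, for example `2 * hours + minutes // 30`, again as an assumption about how such blocks would be numbered.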

The fourth example block of rows 660 of the data table 600 of FIG. 6 indicates that the example attributes handler 410 stores media-type identifying information as Boolean features. In the illustrated example, the attributes handler 410 transforms media-type identifying information that is stored as a string data type at the raw data database 135 to one or more media-related Boolean features. For example, the attributes handler 410 may retrieve media-type identifying information indicative of whether the media asset is a series (e.g., a program that regularly repeats) or a special (e.g., a one-time event such as a movie, a sporting event, a marathon of episodes, etc.) and set the corresponding media-related Boolean feature to true (e.g., “1”) and set (or reset) other media-related Boolean features to false (e.g., “0”). In the illustrated example, the media-related Boolean features identify whether the media asset is a series and a premiere, a new or repeat episode of a series, or whether the media asset is a special and a movie or sports event. For example, in response to determining that retrieved raw media-type identifying information indicates the corresponding media asset is a series premiere episode, the example attributes handler 410 sets the value of the corresponding “Series Premiere” feature to true (e.g., “1”) and sets (or resets) the values of the other media-related Boolean features (e.g., “Series New,” “Series Repeat,” “Special Movie” or “Special Sports”) to false (e.g., “0”). While three example series media-types and two example specials media-types are represented in the example data table 600 of FIG. 6, more or fewer media types may be represented in the example data table 600 corresponding to the many media types of media assets.

While the example data table 600 of FIG. 6 includes five example program attributes related to a media asset (e.g., day-of-week, genre, originator, broadcast time and media type), any other number of program attributes may additionally or alternatively be used.

In the illustrated example of FIG. 4, the example data transformer 140 includes the example social media handler 415 to process social media messages representative of media assets. For example, the social media handler 415 may query and/or retrieve social media messages and/or social media messages-related information from the raw data database 135 (e.g., message identifiers, message timestamps, message content, message authors, etc.) and transform the retrieved social media messages and/or related information into a form meaningfully handled by the example model builder 150 and/or the example future ratings projector 160.

An example data table 700 of the illustrated example of FIG. 7 illustrates example social media data variables transformed into social media predictive features that may be recorded by the social media handler 415 in the example predictive features data store 145. The example data table 700 of the illustrated example of FIG. 7 includes a feature name identifier column 705, a feature data type identifier column 710 and a feature meaning identifier column 715. The example feature name identifier column 705 indicates example predictive features that may be associated with a media asset broadcast and/or useful for projecting ratings for future broadcasts of the media asset. The example feature data type identifier column 710 indicates a data type of the corresponding predictive feature. The example feature meaning identifier column 715 provides a brief description of the value associated with the corresponding predictive feature. While three example feature identifier columns are represented in the example data table 700 of FIG. 7, more or fewer feature identifier columns may be represented in the example data table 700.

The example data table 700 of the illustrated example of FIG. 7 includes two example rows corresponding to example social media predictive features. The first example row 750 indicates that the social media handler 415 of FIG. 4 stores an “SM Count” feature as a floating data type in the example predictive features data store 145. In the illustrated example, the social media handler 415 determines the “SM Count” value, or social media count value, associated with a media asset of interest based on a number of posted social media messages of interest. For example, the social media handler 415 may inspect the social media messages returned by the raw data database 135 for social media messages that indicate exposure to a media asset. For example, a media asset may be “Sports Stuff.” In such an example, a social media message of interest may include the text “Jon is my favorite character on Sports Stuff!” and may include a message timestamp indicating that the social media message was posted by the message author during broadcast of the media asset. In the illustrated example, the social media handler 415 may count the number of social media messages identified as of interest and record a logarithm transformation of the number of social media messages of interest (e.g., the social media messages that indicate exposure to a media asset) as the “SM Count” corresponding to the media asset of interest in the example predictive features data store 145.
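One way to picture the “SM Count” computation is the sketch below. It rests on several assumptions that are not in the text: a message is treated as “of interest” when it mentions the asset name and its timestamp falls within the broadcast window, the message records are plain dictionaries, and `math.log1p` (natural log of 1 + n) is used as the logarithm transformation so that a count of zero is defined. The function name `sm_count` and the sample timestamps are made up for illustration.

```python
import math
from datetime import datetime


def sm_count(messages, asset_name, start, end):
    """Count messages of interest and return a log-transformed "SM Count"."""
    of_interest = [
        m for m in messages
        if asset_name.lower() in m["text"].lower() and start <= m["posted"] <= end
    ]
    # Assumption: log1p keeps the feature defined when no messages are found.
    return math.log1p(len(of_interest))


messages = [
    {"text": "Jon is my favorite character on Sports Stuff!",
     "posted": datetime(2015, 1, 5, 20, 15)},
    {"text": "Sports Stuff was great last week",
     "posted": datetime(2015, 1, 6, 9, 0)},  # posted outside the broadcast window
    {"text": "Making dinner", "posted": datetime(2015, 1, 5, 20, 30)},
]
feature = sm_count(messages, "Sports Stuff",
                   datetime(2015, 1, 5, 20, 0), datetime(2015, 1, 5, 21, 0))
```

Only the first sample message satisfies both conditions, so the stored feature is the log-transformed value of a count of one.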

The second example row 755 of the data table 700 of FIG. 7 indicates that the example social media handler 415 of FIG. 4 stores an “SM UAuthors” feature, a value related to the number of unique authors who posted social media messages of interest, as a floating data type in the example predictive features data store 145. In the illustrated example, the social media handler 415 determines the “SM UAuthors” value, or social media unique authors value, associated with a media asset of interest based on a number of unique authors who posted social media messages of interest. For example, the social media handler 415 may inspect the social media messages returned by the raw data database 135 for social media messages that indicate exposure to a media asset. In the illustrated example, the social media handler 415 may count the number of unique authors who posted the social media messages identified as of interest and record a logarithm transformation of the number of unique authors as the “SM UAuthors” corresponding to the media asset of interest in the example predictive features data store 145.
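The unique-author count differs from the message count only in that repeated posts by the same author are collapsed. A minimal sketch, assuming each message of interest carries an `author` field and again using `math.log1p` as an assumed form of the logarithm transformation (the function name `sm_unique_authors` is hypothetical):

```python
import math


def sm_unique_authors(messages_of_interest):
    """Return a log-transformed count of distinct authors ("SM UAuthors")."""
    # A set keeps one entry per author, however many messages that author posted.
    authors = {m["author"] for m in messages_of_interest}
    return math.log1p(len(authors))


sample = [
    {"author": "user_a", "text": "Watching Sports Stuff"},
    {"author": "user_b", "text": "Sports Stuff is on!"},
    {"author": "user_a", "text": "Sports Stuff again"},
]
uauthors = sm_unique_authors(sample)
```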

In the illustrated example, the example social media handler 415 inspects social media messages and/or social media messages-related information retrieved from the raw data database 135 and transforms the retrieved social media messages and/or related information into a form meaningfully handled by the example model builder 150 and/or the example future ratings projector 160. In some examples, the raw audience measurement data 110 may be provided as aggregated data. For example, rather than providing social media messages and/or social media messages-related information, the example audience measurement system(s) 105 of FIG. 1 may count the number of posted social media messages related to media assets of interest, may count the number of unique authors who posted social media messages related to media assets of interest, etc., and provide the respective counts to the example central facility 125. In some such examples, the example social media handler 415 may retrieve the respective counts and store the logarithm transformation of the corresponding numbers as the respective social media-related predictive features. However, any other technique may be used to determine the number of posted social media messages related to media assets of interest and/or the number of unique authors who posted social media messages related to media assets of interest.

While the example data table 700 of FIG. 7 includes two example social media features, any other number of social media indicators may additionally or alternatively be used. For example, the example data table 700 may include a count of the number of impressions associated with posted social media messages related to media assets of interest.

In the illustrated example of FIG. 4, the example data transformer 140 includes the example spending handler 420 to process spending-related information representative of media assets. For example, the spending handler 420 may query and/or retrieve advertisement-spending variables from the raw data database 135 and transform the retrieved advertisement-spending variables into a form meaningfully handled by the example model builder 150 and/or the example future ratings projector 160.

An example data table 800 of the illustrated example of FIG. 8 illustrates example advertisement-spending variables transformed into spending predictive features that may be recorded by the spending handler 420 in the example predictive features data store 145. The example data table 800 of the illustrated example of FIG. 8 includes a feature name identifier column 805, a feature data type identifier column 810 and a feature meaning identifier column 815. The example feature name identifier column 805 indicates example predictive features that may be associated with a media asset broadcast and/or useful for projecting ratings for future broadcasts of the media asset. The example feature data type identifier column 810 indicates a data type of the corresponding predictive feature. The example feature meaning identifier column 815 provides a brief description of the value associated with the corresponding predictive feature. While three example feature identifier columns are represented in the example data table 800 of FIG. 8, more or fewer feature identifier columns may be represented in the example data table 800.

The first example block of rows 855 indicates that the example spending handler 420 stores advertisement-spending related information as floating data types in the predictive features data store 145. In the illustrated example, the spending handler 420 retrieves the respective amounts and stores the logarithm transformation of the corresponding amounts as the respective advertisement-spending features. However, any other technique may be used to determine the amount of advertisement spending anticipated for the different advertisement vehicles. While the example data table 800 of FIG. 8 includes seven example vehicles for advertisement spending, any other number of advertisement vehicles may additionally or alternatively be used.

In the illustrated example, the example row 865 indicates that the spending handler 420 determines a “Total Ad Spending” value based on the different advertisement vehicles retrieved from the example raw data database 135 (e.g., the fourth example block of rows 375 of FIG. 3). For example, the spending handler 420 may retrieve each of the different advertisement spending variables from the raw data database 135 and sum the total amount anticipated to be spent on advertisements for the corresponding media asset of interest. In the illustrated example, the spending handler 420 records the logarithm transformation of the calculated total amount as the “Total Ad Spending” feature in the example predictive features data store 145 as a floating data type.
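The order of operations for “Total Ad Spending” matters: the per-vehicle amounts are summed first, and the logarithm is taken of the sum. The sketch below illustrates this under stated assumptions: the vehicle names and amounts are invented, the function name `spending_features` is hypothetical, and `math.log1p` stands in for the unspecified logarithm transformation so that a vehicle with zero anticipated spending remains defined.

```python
import math


def spending_features(spend_by_vehicle):
    """Build log-transformed per-vehicle features plus a "Total Ad Spending" feature."""
    features = {
        f"{vehicle} Ad Spending": math.log1p(amount)
        for vehicle, amount in spend_by_vehicle.items()
    }
    # Sum the raw amounts first, then transform the total (not a sum of logs).
    features["Total Ad Spending"] = math.log1p(sum(spend_by_vehicle.values()))
    return features


# Invented vehicles and amounts, for illustration only.
features = spending_features({"TV": 1000.0, "Radio": 250.0, "Internet": 0.0})
```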

In the illustrated example of FIG. 4, the example data transformer 140 includes the example universe handler 425 to process universe estimates-related information representative of populations for different demographic groupings. For example, the universe handler 425 may query and/or retrieve population estimates from the raw data database 135 and transform the retrieved population estimates into a form meaningfully handled by the example model builder 150 and/or the example future ratings projector 160.

An example data table 900 of the illustrated example of FIG. 9 illustrates example population estimates variables transformed into universe estimate features that may be recorded by the universe handler 425 in the example predictive features data store 145. The example data table 900 of the illustrated example of FIG. 9 includes a feature name identifier column 905, a feature data type identifier column 910 and a feature meaning identifier column 915. The example feature name identifier column 905 indicates example predictive features that may be associated with a universe. The example feature data type identifier column 910 indicates a data type of the corresponding predictive feature. The example feature meaning identifier column 915 provides a brief description of the value associated with the corresponding predictive feature. While three example feature identifier columns are represented in the example data table 900 of FIG. 9, more or fewer feature identifier columns may be represented in the example data table 900.

The example block of rows 950 of the example data table 900 indicates that the example universe handler 425 stores universe estimates as floating data types in the predictive features data store 145. In the illustrated example, the universe handler 425 retrieves the respective universe counts and stores the logarithm transformation of the corresponding amounts as the respective universe estimate features. However, any other technique may be used to determine the estimated number of actual households or people from which a sample is taken and to which data from the sample will be projected. While the example data table 900 of FIG. 9 includes universe estimates for twelve example demographic groupings, any other number of demographic groupings may additionally or alternatively be used.

In the illustrated example, the example row 955 indicates that the universe handler 425 determines a “Total Households” value based on the different demographic groupings retrieved from the example raw data database 135. For example, the universe handler 425 may retrieve each of the different universe estimate variables from the raw data database 135 and sum the total number of households or persons in the corresponding demographic groupings. In the illustrated example, the universe handler 425 records the logarithm transformation of the calculated total of households or persons as the “Total Households” feature in the example predictive features data store 145 as a floating data type.

While an example manner of implementing the central facility 125 of FIG. 1 is illustrated in FIG. 1, one or more of the elements, processes and/or devices illustrated in FIG. 1 may be combined, divided, re-arranged, omitted, eliminated and/or implemented in any other way. Further, the example data interface 130, the example raw data database 135, the example media mapper 137, the example media catalog 139, the example data transformer 140, the example predictive features data store 145, the example model builder 150, the example models database 155, the example future ratings projector 160 and/or, more generally, the example central facility 125 of FIG. 1 may be implemented by hardware, software, firmware and/or any combination of hardware, software and/or firmware. Thus, for example, any of the example data interface 130, the example raw data database 135, the example media mapper 137, the example media catalog 139, the example data transformer 140, the example predictive features data store 145, the example model builder 150, the example models database 155, the example future ratings projector 160 and/or, more generally, the example central facility 125 of FIG. 1 could be implemented by one or more analog or digital circuit(s), logic circuits, programmable processor(s), application specific integrated circuit(s) (ASIC(s)), programmable logic device(s) (PLD(s)) and/or field programmable logic device(s) (FPLD(s)). When reading any of the apparatus or system claims of this patent to cover a purely software and/or firmware implementation, at least one of the example data interface 130, the example raw data database 135, the example media mapper 137, the example media catalog 139, the example data transformer 140, the example predictive features data store 145, the example model builder 150, the example models database 155, the example future ratings projector 160 and/or, more generally, the example central facility 125 of FIG. 1 is/are hereby expressly defined to include a tangible computer readable storage device or storage disk such as a memory, a digital versatile disk (DVD), a compact disk (CD), a Blu-ray disk, etc. storing the software and/or firmware. Further still, the example central facility 125 of FIG. 1 may include one or more elements, processes and/or devices in addition to, or instead of, those illustrated in FIG. 1, and/or may include more than one of any or all of the illustrated elements, processes and devices.

While an example manner of implementing the data transformer 140 of FIG. 1 is illustrated in FIG. 4, one or more of the elements, processes and/or devices illustrated in FIG. 4 may be combined, divided, re-arranged, omitted, eliminated and/or implemented in any other way. Further, the example ratings handler 405, the example attributes handler 410, the example social media handler 415, the example spending handler 420, the example universe handler 425 and/or, more generally, the example data transformer 140 of FIG. 4 may be implemented by hardware, software, firmware and/or any combination of hardware, software and/or firmware. Thus, for example, any of the example ratings handler 405, the example attributes handler 410, the example social media handler 415, the example spending handler 420, the example universe handler 425 and/or, more generally, the example data transformer 140 of FIG. 4 could be implemented by one or more analog or digital circuit(s), logic circuits, programmable processor(s), application specific integrated circuit(s) (ASIC(s)), programmable logic device(s) (PLD(s)) and/or field programmable logic device(s) (FPLD(s)). When reading any of the apparatus or system claims of this patent to cover a purely software and/or firmware implementation, at least one of the example ratings handler 405, the example attributes handler 410, the example social media handler 415, the example spending handler 420, the example universe handler 425 and/or, more generally, the example data transformer 140 of FIG. 4 is/are hereby expressly defined to include a tangible computer readable storage device or storage disk such as a memory, a digital versatile disk (DVD), a compact disk (CD), a Blu-ray disk, etc. storing the software and/or firmware. Further still, the example data transformer 140 of FIG. 1 may include one or more elements, processes and/or devices in addition to, or instead of, those illustrated in FIG. 4, and/or may include more than one of any or all of the illustrated elements, processes and devices.

Flowcharts representative of example machine readable instructions for implementing the example central facility of FIG. 1 are shown in FIGS. 10-16 and/or 17. In these examples, the machine readable instructions comprise a program for execution by a processor such as the processor 1912 shown in the example processor platform 1900 discussed below in connection with FIG. 19. The program may be embodied in software stored on a tangible computer readable storage medium such as a CD-ROM, a floppy disk, a hard drive, a digital versatile disk (DVD), a Blu-ray disk, or a memory associated with the processor 1912, but the entire program and/or parts thereof could alternatively be executed by a device other than the processor 1912 and/or embodied in firmware or dedicated hardware. Further, although the example program is described with reference to the flowcharts illustrated in FIGS. 10-16 and/or 17, many other methods of implementing the example central facility 125 may alternatively be used. For example, the order of execution of the blocks may be changed, and/or some of the blocks described may be changed, eliminated, or combined.

As mentioned above, the example processes of FIGS. 10-16 and/or 17 may be implemented using coded instructions (e.g., computer and/or machine readable instructions) stored on a tangible computer readable storage medium such as a hard disk drive, a flash memory, a read-only memory (ROM), a compact disk (CD), a digital versatile disk (DVD), a cache, a random-access memory (RAM) and/or any other storage device or storage disk in which information is stored for any duration (e.g., for extended time periods, permanently, for brief instances, for temporarily buffering, and/or for caching of the information). As used herein, the term tangible computer readable storage medium is expressly defined to include any type of computer readable storage device and/or storage disk and to exclude propagating signals and to exclude transmission media. As used herein, “tangible computer readable storage medium” and “tangible machine readable storage medium” are used interchangeably. Additionally or alternatively, the example processes of FIGS. 10-16 and/or 17 may be implemented using coded instructions (e.g., computer and/or machine readable instructions) stored on a non-transitory computer and/or machine readable medium such as a hard disk drive, a flash memory, a read-only memory, a compact disk, a digital versatile disk, a cache, a random-access memory and/or any other storage device or storage disk in which information is stored for any duration (e.g., for extended time periods, permanently, for brief instances, for temporarily buffering, and/or for caching of the information). As used herein, the term non-transitory computer readable medium is expressly defined to include any type of computer readable storage device and/or storage disk and to exclude propagating signals and to exclude transmission media. As used herein, when the phrase “at least” is used as the transition term in a preamble of a claim, it is open-ended in the same manner as the term “comprising” is open ended. “Comprising” and all other variants of “comprise” are expressly defined to be open-ended terms. “Including” and all other variants of “include” are also defined to be open-ended terms. In contrast, the term “consisting” and/or other forms of “consist” are defined to be close-ended terms.

FIG. 10 is a flowchart representative of example machine-readable instructions 1000 that may be executed by the example central facility 125 of FIG. 1 to project ratings for future broadcasts of media. The example instructions 1000 of FIG. 10 begin at block 1002 when the example central facility 125 receives a request for ratings projections for a future broadcast of media. For example, the client 170 may request that the AME 120 project ratings for the example programming schedule 200 of FIG. 2. The request may be to project ratings for a near-term quarter (e.g., a quarter that is one or two quarters in the future) or to project ratings for an upfront quarter (e.g., a quarter that is three or more quarters in the future).

At block 1004, the example central facility 125 obtains data related to the request. For example, the central facility 125 may parse the raw data database 135 (FIG. 1) to obtain data for building one or more projection model(s). In some examples, the example media mapper 137 (FIG. 1) may identify media related to media assets included in the programming schedule 200. In some examples, the example data transformer 140 (FIGS. 1 and/or 4) may transform raw data stored in the raw data database 135 into a form meaningfully handled by the example model builder 150 (FIG. 1) and/or the example future ratings projector 160 (FIG. 1).

At block 1006, the example central facility 125 builds one or more projection model(s). For example, the model builder 150 may determine a relationship between predictive features stored in the predictive features data store 145 (FIG. 1) and historical ratings. The example model builder 150 stores the generated model(s) in the example models database 155 (FIG. 1). An example approach to build a projection model is described below in connection with FIG. 17.

At block 1008, the example central facility 125 determines projected ratings for future broadcasts of media. For example, the example future ratings projector 160 may apply data related to a media asset of interest to a projection model to estimate ratings for a media asset based on the programming schedule 200. An example approach to estimate ratings for future broadcasts of media is described below in connection with FIG. 18. The example process 1000 of FIG. 10 then ends.
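The obtain-build-apply sequence of blocks 1004 through 1008 can be sketched as below. The model here is only a stand-in: a one-feature ordinary least-squares line is swapped in because this passage does not specify the form of the projection model. All function names and the sample history are hypothetical.

```python
from statistics import mean


def build_model(features, ratings):
    """Block 1006 stand-in: fit rating = a + b * feature by ordinary least squares."""
    fx, fy = mean(features), mean(ratings)
    sxx = sum((x - fx) ** 2 for x in features)
    b = sum((x - fx) * (y - fy) for x, y in zip(features, ratings)) / sxx
    a = fy - b * fx
    return a, b


def project(model, feature):
    """Block 1008 stand-in: apply a feature value for a future broadcast to the model."""
    a, b = model
    return a + b * feature


def handle_request(history, future_feature):
    # Block 1004: obtain the data needed for the request.
    features = [h["feature"] for h in history]
    ratings = [h["rating"] for h in history]
    model = build_model(features, ratings)  # block 1006
    return project(model, future_feature)   # block 1008


# Invented history in which rating = 1 + 2 * feature exactly.
history = [{"feature": 1, "rating": 3}, {"feature": 2, "rating": 5},
           {"feature": 3, "rating": 7}]
projected = handle_request(history, 4)
```

Because the invented history lies exactly on a line, the stand-in model recovers it and projects a rating of 9 for a feature value of 4.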

While in the illustrated example, the example instructions 1000 of FIG. 10 represent a single iteration of projecting ratings for future broadcasts of media, in practice, the example instructions 1000 of the illustrated example of FIG. 10 may be executed in parallel (e.g., in separate threads) to allow the central facility 125 to handle multiple requests for ratings projections at a time.
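Handling several requests in separate threads might look like the following sketch, which uses the Python standard library's `ThreadPoolExecutor`. The function `project_ratings` is a hypothetical placeholder for one pass through the instructions 1000, and the request format is assumed.

```python
from concurrent.futures import ThreadPoolExecutor


def project_ratings(request):
    """Hypothetical stand-in for one iteration of the instructions 1000."""
    return {"request_id": request["id"], "status": "projected"}


def handle_requests(requests, max_workers=4):
    """Run one projection per request, each in a worker thread."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(project_ratings, requests))


results = handle_requests([{"id": 1}, {"id": 2}, {"id": 3}])
```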

FIG. 11 is a flowchart representative of example machine-readable instructions 1100 that may be executed by the example central facility 125 of FIG. 1 to catalog related media. The example instructions 1100 of FIG. 11 begin at block 1102 when the example central facility 125 receives audience measurement data 110 from the example audience measurement system(s) 105 of FIG. 1. For example, the example data interface 130 (FIG. 1) may obtain and/or retrieve example panelist media measurement data 110A and/or example social media activity data 110B periodically and/or based on one or more events. In some examples, the data interface 130 may obtain and/or receive an example programming schedule 175 from the example client 170 periodically and/or based on one or more events. In some examples, the data interface 130 may obtain, retrieve and/or receive the example audience measurement data 110 and/or the programming schedule 175 aperiodically and/or as a one-time event. The example data interface 130 stores the audience measurement data 110 in the example raw data database 135 (FIG. 1).

At block 1104, the example central facility 125 indexes the audience measurement data 110. For example, the example media mapper 137 (FIG. 1) may parse the raw data database 135 and identify media identifiers associated with different media assets. At block 1106, the example media mapper 137 identifies related media. For example, the media mapper 137 may identify related media by comparing program names. In some examples, the media mapper 137 may identify related media by processing the program names for typographical errors (e.g., common typographical errors). In some examples, the media mapper 137 utilizes title names, broadcast day and times, media director(s), character name(s), actor and actress name(s), etc., to identify related media.

At block 1108, the example media mapper 137 records the media mappings. For example, the media mapper 137 may map a first media asset (e.g., a first media asset name) to a second media asset (e.g., a second media asset name) and store the media mapping in the example media catalog 139 (FIG. 1). The example process 1100 of FIG. 11 then ends.
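As one possible illustration of comparing program names while tolerating typographical errors, the sketch below uses `difflib.get_close_matches` from the Python standard library. This is a swapped-in technique chosen for brevity, not the matching method of the disclosure; the function name `map_related_media`, the similarity cutoff of 0.85 and the sample names are all assumptions.

```python
import difflib


def map_related_media(names, cutoff=0.85):
    """Map each program name to a canonical name, tolerating small typos."""
    catalog = {}    # media mapping: observed name -> canonical name
    canonical = []  # canonical names seen so far (lower-cased)
    for name in names:
        key = name.strip().lower()
        # Look for an already-cataloged name that is similar enough.
        match = difflib.get_close_matches(key, canonical, n=1, cutoff=cutoff)
        if match:
            catalog[name] = match[0]
        else:
            canonical.append(key)
            catalog[name] = key
    return catalog


catalog = map_related_media(["Sports Stuff", "Sprots Stuff", "Cooking Hour"])
```

With these sample names, the misspelled “Sprots Stuff” maps to the same canonical entry as “Sports Stuff,” while “Cooking Hour” receives its own entry. A production matcher would likely also use the other signals listed above (broadcast day and time, director, cast, etc.).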

FIG. 12 is a flowchart representative of example machine-readable instructions 1200 that may be executed by the example data transformer 140 of FIGS. 1 and/or 4 to transform raw audience measurement data to predictive features. The example process 1200 of the illustrated example of FIG. 12 begins at block 1202 when the example data transformer 140 obtains ratings-related information associated with media assets. For example, the data transformer 140 may retrieve and/or query “media” ratings, “DayTime” ratings and/or information representative of whether a panelist viewed a particular episode of a media asset from the example raw data database 135. At block 1204, the example ratings handler 405 (FIG. 4) transforms the ratings-related information to ratings predictive features for use by the example model builder 150 and/or the example future ratings projector 160. In the illustrated example, the ratings handler 405 transforms the ratings-related information in accordance with the example ratings predictive features table 500 of FIG. 5. At block 1206, the example ratings handler 405 determines whether there is additional ratings-related information to transform. If, at block 1206, the ratings handler 405 determined that there is additional ratings-related information to transform, control returns to block 1202.

If, at block 1206, the ratings handler 405 determined that there is not additional ratings-related information to transform, then, at block 1208, the example data transformer 140 obtains program attributes information associated with media assets. For example, the example attributes handler 410 (FIG. 4) may retrieve and/or query the example raw data database 135 for day-of-week information, genre information, network information and/or broadcast time information. At block 1210, the example attributes handler 410 transforms the program attributes information to program attributes predictive features for use by the example model builder 150 and/or the example future ratings projector 160. In the illustrated example, the attributes handler 410 transforms the program attributes information in accordance with the example program attributes predictive features table 600 of FIG. 6. At block 1212, the example attributes handler 410 determines whether there is additional program attributes information to transform. If, at block 1212, the attributes handler 410 determined that there is additional program attributes information to transform, control returns to block 1208.

If, at block 1212, the attributes handler 410 determined that there is not additional program attributes information to transform, then, at block 1214, the example data transformer 140 obtains social media messages-related information associated with media assets. For example, the example social media handler 415 may retrieve and/or query the example raw data database 135 for a number of posted social media messages of interest and/or a number of unique authors who posted social media messages of interest. At block 1216, the example social media handler 415 transforms the social media messages-related information to social media predictive features for use by the example model builder 150 and/or the example future ratings projector 160. In the illustrated example, the social media handler 415 transforms the social media messages-related information in accordance with the example social media predictive features table 700 of FIG. 7. At block 1218, the example social media handler 415 determines whether there is additional social media messages-related information to transform. If, at block 1218, the social media handler 415 determined that there is additional social media messages-related information to transform, control returns to block 1214.

If, at block 1218, the social media handler 415 determined that there is not additional social media information to transform, then, at block 1220, the example data transformer 140 obtains spending-related information associated with media assets. For example, the example spending handler 420 may retrieve and/or query the example raw data database 135 for anticipated amounts (e.g., in money or resources) associated with different advertising vehicles. At block 1222, the example spending handler 420 transforms the spending-related information to advertisement spending predictive features for use by the example model builder 150 and/or the example future ratings projector 160. In the illustrated example, the spending handler 420 transforms the spending-related information in accordance with the example advertisement spending predictive features table 800 of FIG. 8. At block 1224, the example spending handler 420 determines whether there is additional spending-related information to transform. If, at block 1224, the spending handler 420 determined that there is additional spending-related information to transform, control returns to block 1220.

If, at block 1224, the spending handler 420 determined that there is not additional spending-related information to transform, then, at block 1226, the example data transformer 140 obtains universe-estimates information associated with media assets. For example, the example universe handler 425 may retrieve and/or query the example raw data database 135 for the number of households and/or people associated with different demographic groupings. At block 1228, the example universe handler 425 transforms the universe estimates-related information to universe estimates predictive features for use by the example model builder 150 and/or the example future ratings projector 160. In the illustrated example, the universe handler 425 transforms the universe estimates-related information in accordance with the example universe estimates predictive features table 900 of FIG. 9. At block 1230, the example universe handler 425 determines whether there is additional universe estimates-related information to transform. If, at block 1230, the universe handler 425 determined that there is additional universe estimates-related information to transform, control returns to block 1226.

If, at block 1230, the universe handler 425 determined that there is not additional universe estimates-related information to transform, then, at block 1232, the example data transformer 140 determines whether to continue normalizing audience measurement data. If, at block 1232, the example data transformer 140 determined to continue normalizing audience measurement data, control returns to block 1202 to wait to obtain ratings-related information for transforming.

If, at block 1232, the example data transformer 140 determined not to continue normalizing audience measurement data, the example process 1200 of FIG. 12 ends.

While in the illustrated example, the example instructions 1200 of FIG. 12 represent a single iteration of normalizing audience measurement data, in practice, the example instructions 1200 of the illustrated example of FIG. 12 may be executed in parallel (e.g., in separate threads) to allow the central facility 125 to handle multiple requests for normalizing audience measurement data at a time.

FIG. 13 is a flowchart representative of example machine-readable instructions 1300 that may be executed by the example central facility 125 of FIG. 1 to project ratings for future broadcasts of a media asset. The example process 1300 of the illustrated example of FIG. 13 begins at block 1302 when the example central facility 125 determines the quarter of interest. If, at block 1302, the central facility 125 determined that the quarter of interest is within the next two quarters (block 1304), then, at block 1306, the central facility 125 determines to build a near-term projection model.

If, at block 1302, the central facility 125 determined that the quarter of interest is more than two quarters (e.g., three or more quarters) from the current quarter (block 1308), then, at block 1310, the central facility 125 determines to build an upfront projection model.
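The choice between the near-term and upfront projection models at blocks 1302-1310 can be sketched as follows. This is a minimal illustration, assuming quarters are represented as hypothetical `(year, quarter)` tuples; the function names are not taken from the disclosure.

```python
def quarters_ahead(current, target):
    """Return how many quarters `target` lies after `current`.

    Both arguments are (year, quarter) tuples with quarter in 1..4.
    """
    (current_year, current_q), (target_year, target_q) = current, target
    return (target_year - current_year) * 4 + (target_q - current_q)

def select_model_type(current, target):
    """Pick the projection model type from the distance to the quarter of interest."""
    horizon = quarters_ahead(current, target)
    if horizon < 1:
        raise ValueError("quarter of interest must be a future quarter")
    # One or two quarters ahead: near-term; three or more: upfront.
    return "near-term" if horizon <= 2 else "upfront"
```

For example, with a current quarter of `(2015, 4)`, a quarter of interest of `(2016, 2)` is two quarters ahead and selects the near-term model, while `(2016, 3)` is three quarters ahead and selects the upfront model.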

At block 1312, the central facility 125 determines the amount of future programming information that is available and classifies the media asset accordingly. For example, if, at block 1312, the central facility 125 determined that a media asset of interest is a television series (e.g., a regular series) (block 1314), then, at block 1316, the central facility 125 applies predictive features associated with a first module (Module 1) when building the projection model and predicting the future ratings for the media asset of interest. Example predictive features associated with Module 1 are illustrated in example schema 1400 of FIG. 14. The example process 1300 of FIG. 13 ends.

If, at block 1312, the central facility 125 determined that a media asset of interest is special programming (block 1318), then, at block 1320, the central facility 125 applies predictive features associated with a second module (Module 2) when building the projection model and predicting the future ratings for the media asset of interest. Example predictive features associated with Module 2 are illustrated in example schema 1500 of FIG. 15. The example process 1300 of FIG. 13 ends.

If, at block 1312, the central facility 125 determined that no future programming information is available for the media asset of interest (block 1322), then, at block 1324, the central facility 125 applies predictive features associated with a third module (Module 3) when building the projection model and predicting the future ratings for the media asset of interest. Example predictive features associated with Module 3 are illustrated in example schema 1600 of FIG. 16. The example process 1300 of FIG. 13 ends.
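The three-way classification at block 1312 can be illustrated with a short sketch. The program-type labels below are drawn from the episode and program indications mentioned in this description and in the claims; representing them as strings, treating a missing value as "no future programming information," and routing unrecognized labels to Module 3 are assumptions made for this example.

```python
from typing import Optional

# Hypothetical string labels for the indications named in the description.
SERIES_TYPES = {"premiere episode", "new episode", "repeat episode"}
SPECIAL_TYPES = {"special", "movie", "sporting event"}

def select_module(program_type: Optional[str]) -> int:
    """Map a media asset's program type to Module 1, 2 or 3."""
    if program_type in SERIES_TYPES:
        return 1  # television series
    if program_type in SPECIAL_TYPES:
        return 2  # special programming
    # None (no future programming information) or an unrecognized label.
    return 3
```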

An example schema 1400 of the illustrated example of FIG. 14 illustrates example sets of predictive features that are used when generating projection models and/or that are applied to a projection model when projecting ratings of future broadcasts for television series (Module 1). The example schema 1400 indicates that audience measurement data 110 may be obtained and/or retrieved from a National People Meter (NPM) database 1402, which includes, but is not limited to, client-provided program characteristics, panelist-provided demographic information and viewing behaviors of individual households (HH) via people meters. In the illustrated example, the data provided by the NPM database 1402 may include TV ratings information 1404 and ratings and content characteristics information 1406. In the illustrated example of FIG. 14, the TV ratings information 1404 includes historical audience measurements, such as, but not limited to, day and time ratings, series historical performance and corresponding information for related programs (e.g., programs related by name, day and time and/or content and network).

In the illustrated example of FIG. 14, the ratings and content characteristics information 1406 includes program characteristics (e.g., genre, originator, day of week, yearly quarter, hour block, etc.). The example ratings and content characteristics information 1406 of FIG. 14 also includes an indication of whether a media asset is a premiere episode, a new episode or a repeat episode. The example ratings and content characteristics information 1406 also includes demographic information such as household, age, gender, etc.

The example schema 1400 of FIG. 14 also includes other client-provided information 1408 (e.g., advertisement spending), other audience measurement system(s) information 1410 (e.g., universe estimates) and other third party information 1412 (e.g., social media indicators).

In the illustrated example, the schema 1400 indicates that the information provided by the data sources 1402, 1404, 1406, 1408, 1410, 1412 is processed (e.g., transformed) into predictors (e.g., predictive features). The predictors may be used by the central facility 125 of FIG. 1 to generate projection models and/or to project ratings of future broadcasts for television series (Module 1).

An example schema 1500 of the illustrated example of FIG. 15 illustrates example sets of predictive features that are used when generating projection models and/or that are applied to a projection model when projecting ratings of future broadcasts for special programming (Module 2). The example schema 1500 indicates that audience measurement data 110 may be obtained and/or retrieved from a National People Meter (NPM) database 1502, which includes, but is not limited to, client-provided program characteristics, panelist-provided demographic information and viewing behaviors of individual households (HH) via people meters. In the illustrated example, the data provided by the NPM database 1502 may include TV ratings information 1504 and ratings and content characteristics information 1506. In the illustrated example of FIG. 15, the TV ratings information 1504 includes historical audience measurements, such as, but not limited to, day and time ratings associated with a media asset and corresponding information for related programs (e.g., programs related by name, day and time and/or content and network).

In the illustrated example of FIG. 15, the ratings and content characteristics information 1506 includes program characteristics (e.g., genre, originator, day of week, yearly quarter, hour block, etc.). The example ratings and content characteristics information 1506 of FIG. 15 also includes an indication of whether a media asset is a special, a movie, etc. The example ratings and content characteristics information 1506 of FIG. 15 also includes an indication of whether a media asset is a premiere episode, a new episode or a repeat episode. The example ratings and content characteristics information 1506 also includes demographic information such as household, age, gender, etc.

The example schema 1500 of FIG. 15 also includes other client-provided information 1508 (e.g., advertisement spending), other audience measurement system(s) information 1510 (e.g., universe estimates) and other third party information 1512 (e.g., social media indicators).

In the illustrated example, the schema 1500 indicates that the information provided by the data sources 1502, 1504, 1506, 1508, 1510, 1512 is processed (e.g., transformed) into predictors (e.g., predictive features). The predictors may be used by the central facility 125 of FIG. 1 to generate projection models and/or to project ratings of future broadcasts for special programming (Module 2).

An example schema 1600 of the illustrated example of FIG. 16 illustrates example sets of predictive features that are used when generating projection models and/or that are applied to a projection model when projecting ratings of future broadcasts for media with unknown future programming information (Module 3). The example schema 1600 indicates that audience measurement data 110 may be obtained and/or retrieved from a National People Meter (NPM) database 1602, which includes, but is not limited to, client-provided program characteristics, panelist-provided demographic information and viewing behaviors of individual households (HH) via people meters. In the illustrated example, the data provided by the NPM database 1602 may include TV ratings information 1604 and ratings and content characteristics information 1606. In the illustrated example of FIG. 16, the TV ratings information 1604 includes historical audience measurements, such as, but not limited to, day and time ratings associated with a media asset and corresponding information for related programs (e.g., programs related by name, day and time and/or content and network).

In the illustrated example of FIG. 16, the ratings and content characteristics information 1606 includes program characteristics (e.g., genre, originator, day of week, yearly quarter, hour block, etc.). The example ratings and content characteristics information 1606 of FIG. 16 also includes an indication of whether a media asset is a special, a movie, a premiere episode, a repeat episode, a new episode, etc. The example ratings and content characteristics information 1606 also includes demographic information such as household, age, gender, etc.

The example schema 1600 of FIG. 16 also includes other client-provided information 1608 (e.g., advertisement spending), other audience measurement system(s) information 1610 (e.g., universe estimates) and other third party information 1612 (e.g., social media indicators).

In the illustrated example, the schema 1600 indicates that the information provided by the data sources 1602, 1604, 1606, 1608, 1610, 1612 is processed (e.g., transformed) into predictors (e.g., predictive features). The predictors may be used by the central facility 125 of FIG. 1 to generate projection models and/or to project ratings of future broadcasts for media assets with unknown future programming information (Module 3).
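For illustration only, the predictor groups described for the three schemas can be summarized as sets keyed by module. The group names are paraphrased from the descriptions of FIGS. 14-16 above; the dictionary layout is hypothetical, and the figures themselves may list additional predictors not captured here.

```python
# Predictor groups described as common to schemas 1400, 1500 and 1600.
COMMON_GROUPS = {
    "day and time ratings",
    "related program ratings",
    "program characteristics",
    "episode type",
    "demographics",
    "advertisement spending",
    "universe estimates",
    "social media indicators",
}

# Hypothetical per-module summary of the textual descriptions.
MODULE_GROUPS = {
    1: COMMON_GROUPS | {"series historical performance"},
    2: COMMON_GROUPS | {"program type"},
    3: COMMON_GROUPS | {"program type"},
}

# Groups described for television series (Module 1) but not for specials.
series_only = MODULE_GROUPS[1] - MODULE_GROUPS[2]
```

In this summary, series historical performance is the group that distinguishes Module 1, which is consistent with the description of the TV ratings information 1404.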

FIG. 17 is a flowchart representative of example machine-readable instructions 1700 that may be executed by the example model builder 150 of FIG. 1 to build a ratings projection model. The example process 1700 of the illustrated example of FIG. 17 begins at block 1702 when the example model builder 150 selects a projection model to build based on the quarter of interest. For example, the model builder 150 may select an upfront projection model when the quarter of interest is three or more quarters in the future and select the near-term projection model when the quarter of interest is one or two quarters in the future.

At block 1704, the example model builder 150 selects a demographic grouping associated with the projection model. For example, the model builder 150 may generate a plurality of projection models corresponding to different demographic segments.

At block 1706, the example model builder 150 obtains historical data stored in the example predictive features data store 145 (FIG. 1) based on the quarter of interest and the selected demographic segment. In the illustrated example, the model builder 150 obtains historical data from the predictive features data store 145 corresponding to the eight previous quarters from the quarter of interest. For example, if the quarter of interest is the first quarter of 2016, then the model builder 150 retrieves historical data from the predictive features data store 145 corresponding to the four quarters of 2015 and the four quarters of 2014.
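The eight-quarter lookback described above can be computed with simple calendar arithmetic. The sketch below assumes quarters are represented as hypothetical `(year, quarter)` tuples; the function name is illustrative.

```python
def previous_quarters(year, quarter, count=8):
    """Return the `count` quarters preceding (year, quarter), oldest first."""
    quarters = []
    for _ in range(count):
        quarter -= 1
        if quarter == 0:
            # Step back from Q1 to Q4 of the prior year.
            year, quarter = year - 1, 4
        quarters.append((year, quarter))
    return quarters[::-1]

window = previous_quarters(2016, 1)
```

For a quarter of interest of the first quarter of 2016, `window` spans the four quarters of 2014 followed by the four quarters of 2015, matching the example in the text.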

At block 1708, the example model builder 150 determines whether to exclude a subset of the obtained historical data based on the selected model. For example, if, at block 1708, the model builder 150 determined that the model builder 150 is building an upfront projection model, then, at block 1710, the model builder 150 excludes historical data corresponding to the gap between the current quarter and the quarter of interest. For example, if the quarter of interest is three quarters in the future (e.g., Q+3), then the gap is two quarters and historical data from the two previous quarters (e.g., Q+1 and Q+2) is excluded when training the model.
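One possible reading of the exclusion at block 1710 is that the gap quarters are the most recent entries of the lookback window that precedes the quarter of interest. The sketch below follows that reading; the function name, the string labels for quarters, and the assumption that the gap equals the horizon minus one are illustrative choices, not details confirmed by the disclosure.

```python
def exclude_gap(window, horizon, model_type):
    """Drop the gap quarters from a chronologically ordered lookback window.

    `window` lists the quarters preceding the quarter of interest, oldest
    first. `horizon` is how many quarters ahead the quarter of interest is.
    """
    if model_type != "upfront":
        return list(window)  # no exclusion for the near-term model
    # Assumed: the gap is the quarters strictly between now and the target.
    gap = min(max(horizon - 1, 0), len(window))
    return list(window[:len(window) - gap])

lookback = ["Q-5", "Q-4", "Q-3", "Q-2", "Q-1", "Q", "Q+1", "Q+2"]
training_quarters = exclude_gap(lookback, horizon=3, model_type="upfront")
```

With a quarter of interest of Q+3, the two gap quarters Q+1 and Q+2 are removed, leaving six quarters of historical data for training.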

At block 1712, the model builder 150 generates a projection model. For example, the model builder 150 may determine a relationship between the included historical data and measured ratings to generate the projection model. For example, the model builder 150 may use any appropriate regression model, time-series model, etc. to represent the relationship between the included historical data and the measured ratings. In some examples, the model builder 150 trains and validates the parameters of the generated projection model by holding out a subset of the data. For example, the model builder 150 may hold out 30% of the included historical data and train the projection model using the remaining 70% of the historical data. The model builder 150 may then use the hold-out data to validate (e.g., test) the projection model.
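The hold-out procedure can be sketched with a single-feature least-squares line standing in for "any appropriate regression model." The choice of a one-variable linear fit, mean absolute error as the validation measure, and all function names are assumptions made for this example.

```python
import random

def split_holdout(rows, holdout_fraction=0.3, seed=0):
    """Shuffle rows and split them into (training, hold-out) lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_holdout = round(len(rows) * holdout_fraction)
    return rows[n_holdout:], rows[:n_holdout]

def fit_line(pairs):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, pairs):
    """Average absolute difference between predicted and measured values."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in pairs) / len(pairs)

# Synthetic (feature, rating) pairs used only to exercise the sketch.
data = [(x, 2.0 * x + 1.0) for x in range(10)]
train, holdout = split_holdout(data)
model = fit_line(train)
validation_error = mean_abs_error(model, holdout)
```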

At block 1714, the model builder 150 determines whether the generated projection model satisfies a correlation threshold. For example, if the measured error between the actual ratings and the predicted ratings does not satisfy the correlation threshold, then control returns to block 1712 to perform additional training and testing iterations.

If, at block 1714, the model builder 150 determined that the measured error does satisfy the correlation threshold, then, at block 1716, the model builder 150 records the generated projection model in the models data store 155.

At block 1718, the example model builder 150 determines whether there is another demographic grouping to process for the selected projection model. If, at block 1718, the model builder 150 determined that there is another demographic grouping to process, then control returns to block 1704 to select a demographic grouping to process.
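The loop over demographic groupings in blocks 1704-1718 can be sketched as below. The `train_and_validate` callback, the interpretation of "satisfies the threshold" as an error at or below a limit, and the iteration cap are all assumptions introduced for this example; the disclosure does not specify a cap.

```python
def build_models(demographics, train_and_validate, error_threshold,
                 max_iterations=10):
    """Train one model per demographic grouping and record accepted models.

    `train_and_validate(demographic, iteration)` is a hypothetical callback
    returning a (model, measured_error) pair for one training iteration.
    """
    models_store = {}
    for demographic in demographics:
        for iteration in range(max_iterations):
            model, error = train_and_validate(demographic, iteration)
            if error <= error_threshold:
                models_store[demographic] = model  # record accepted model
                break
    return models_store

def fake_trainer(demographic, iteration):
    """Stand-in trainer whose error shrinks with each iteration."""
    if demographic == "never_converges":
        return ("model", 1.0)
    return (f"{demographic}-model-{iteration}", 1.0 / (iteration + 1))

store = build_models(["P18-49", "P25-54", "never_converges"],
                     fake_trainer, error_threshold=0.3)
```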

FIG. 18 is a flowchart representative of example machine-readable instructions 1800 that may be executed by the example future ratings projector 160 of FIG. 1 to project ratings for future broadcasts of a media asset. The example process 1800 of the illustrated example of FIG. 18 begins at block 1802 when the example future ratings projector 160 determines whether the media asset of interest is a television series. For example, the future ratings projector 160 may retrieve the program characteristics of the media asset from the predictive features data store 145 (FIG. 1). If, at block 1802, the future ratings projector 160 determined that the media asset of interest is a television series, then, at block 1804, the future ratings projector 160 selects Module 1 to project the future media ratings of the media asset of interest. At block 1806, the future ratings projector 160 obtains the predictive features for the quarter of interest from the predictive features data store 145 based on Module 1. In some examples, the future ratings projector 160 may consult the example schema 1400 of the illustrated example of FIG. 14 to determine the predictive features associated with Module 1. Control then proceeds to block 1816 to apply the predictive features to the media asset of interest.

If, at block 1802, the future ratings projector 160 determined that the media asset of interest is not a television series, then, at block 1808, the future ratings projector 160 determines whether the media asset of interest is a special. For example, the future ratings projector 160 may retrieve the program attributes predictive features from the predictive features data store 145. If, at block 1808, the future ratings projector 160 determined that the media asset of interest is a special, then, at block 1810, the future ratings projector 160 selects Module 2 to project the future media ratings of the media asset of interest. At block 1812, the future ratings projector 160 obtains the predictive features for the quarter of interest from the predictive features data store 145 based on Module 2. In some examples, the future ratings projector 160 may consult the example schema 1500 of the illustrated example of FIG. 15 to determine the predictive features associated with Module 2. Control then proceeds to block 1816 to apply the predictive features to the media asset of interest.

If, at block 1808, the future ratings projector 160 determined that the media asset of interest is not a special, then, at block 1814, the future ratings projector 160 obtains the predictive features for the quarter of interest from the predictive features data store 145 based on Module 3. In some examples, the future ratings projector 160 may consult the example schema 1600 of the illustrated example of FIG. 16 to determine the predictive features associated with Module 3.

At block 1816, the example future ratings projector 160 applies the obtained predictive features for the quarter of interest to the selected projection module. At block 1818, the example future ratings projector 160 determines whether there is another media asset of interest to process. If, at block 1818, the example future ratings projector 160 determined that there is another media asset of interest to process, then control returns to block 1802 to determine whether the media asset of interest is a television series.

If, at block 1818, the example future ratings projector 160 determined that there is not another media asset of interest to process, then, at block 1820, the future ratings projector 160 generates a report. For example, the future ratings projector 160 may generate a report including the projected ratings of the one or more media asset(s) of interest. In some examples, the future ratings projector 160 may generate a tool that can be used by client 170 to generate the report. The example process 1800 of FIG. 18 ends.
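The overall flow of FIG. 18 (module selection, feature retrieval, projection and report generation) can be sketched as a single loop. The asset record layout, the `features_for` callback, and the idea of keying one callable model per module are assumptions made for this example rather than details of the disclosed projector.

```python
def project_all(assets, models, features_for):
    """Project a rating for each asset and collect the results as a report.

    `assets` is a list of dicts with a "name" key and an optional "kind"
    key ("series" or "special"). `models` maps a module number to a callable
    taking a feature list. `features_for(asset, module)` is a hypothetical
    lookup standing in for the predictive features data store.
    """
    report = []
    for asset in assets:
        kind = asset.get("kind")
        if kind == "series":
            module = 1
        elif kind == "special":
            module = 2
        else:
            module = 3  # neither a series nor a special
        features = features_for(asset, module)
        report.append((asset["name"], module, models[module](features)))
    return report

# Toy stand-ins used only to exercise the loop.
toy_models = {1: lambda f: sum(f), 2: lambda f: max(f), 3: lambda f: 0.0}
toy_assets = [
    {"name": "A", "kind": "series"},
    {"name": "B", "kind": "special"},
    {"name": "C"},
]
report = project_all(toy_assets, toy_models, lambda asset, module: [1.0, 2.0])
```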

FIG. 19 is a block diagram of an example processor platform 1900 capable of executing the instructions of FIGS. 10-16 and/or 17 to implement the central facility 125 of FIG. 1 and/or the data transformer 140 of FIGS. 1 and/or 4. The processor platform 1900 can be, for example, a server, a personal computer, or any other type of computing device.

The processor platform 1900 of the illustrated example includes a processor 1912. The processor 1912 of the illustrated example is hardware. For example, the processor 1912 can be implemented by one or more integrated circuits, logic circuits, microprocessors or controllers from any desired family or manufacturer.

The processor 1912 of the illustrated example includes a local memory 1913 (e.g., a cache). The processor 1912 of the illustrated example executes the instructions to implement the example data interface 130, the example media mapper 137, the example data transformer 140, the example model builder 150, the example future ratings projector 160, the example ratings handler 405, the example attributes handler 410, the example social media handler 415, the example spending handler 420 and the example universe handler 425. The processor 1912 of the illustrated example is in communication with a main memory including a volatile memory 1914 and a non-volatile memory 1916 via a bus 1918. The volatile memory 1914 may be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM) and/or any other type of random access memory device. The non-volatile memory 1916 may be implemented by flash memory and/or any other desired type of memory device. Access to the main memory 1914, 1916 is controlled by a memory controller.

The processor platform 1900 of the illustrated example also includes an interface circuit 1920. The interface circuit 1920 may be implemented by any type of interface standard, such as an Ethernet interface, a universal serial bus (USB), and/or a PCI express interface.

In the illustrated example, one or more input devices 1922 are connected to the interface circuit 1920. The input device(s) 1922 permit(s) a user to enter data and commands into the processor 1912. The input device(s) can be implemented by, for example, an audio sensor, a microphone, a camera (still or video), a keyboard, a button, a mouse, a touchscreen, a track-pad, a trackball, isopoint and/or a voice recognition system.

One or more output devices 1924 are also connected to the interface circuit 1920 of the illustrated example. The output devices 1924 can be implemented, for example, by display devices (e.g., a light emitting diode (LED), an organic light emitting diode (OLED), a liquid crystal display, a cathode ray tube display (CRT), a touchscreen, a tactile output device, a printer and/or speakers). The interface circuit 1920 of the illustrated example, thus, typically includes a graphics driver card, a graphics driver chip or a graphics driver processor.

The interface circuit 1920 of the illustrated example also includes a communication device such as a transmitter, a receiver, a transceiver, a modem and/or network interface card to facilitate exchange of data with external machines (e.g., computing devices of any kind) via a network 1926 (e.g., an Ethernet connection, a digital subscriber line (DSL), a telephone line, coaxial cable, a cellular telephone system, etc.).

The processor platform 1900 of the illustrated example also includes one or more mass storage devices 1928 for storing software and/or data. Examples of such mass storage devices 1928 include floppy disk drives, hard drive disks, compact disk drives, Blu-ray disk drives, RAID systems, and digital versatile disk (DVD) drives. The example mass storage 1928 implements the example raw data database 135, the example media catalog 139, the example predictive features data store 145 and the example models data store 155.

The coded instructions 1932 of FIGS. 10-16 and/or 17 may be stored in the mass storage device 1928, in the volatile memory 1914, in the non-volatile memory 1916, and/or on a removable tangible computer readable storage medium such as a CD or DVD.

From the foregoing, it will be appreciated that the above disclosed methods, apparatus and articles of manufacture facilitate projecting ratings for future broadcasts of media. For example, disclosed examples include building a projection model based on historical audience measurement data and future quarters of interest. Examples disclosed herein may then apply data related to the quarter of interest and media of interest to project ratings for the media asset of interest.

Although certain example methods, apparatus and articles of manufacture have been disclosed herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all methods, apparatus and articles of manufacture fairly falling within the scope of the claims of this patent.

What is claimed is:
 1. A ratings projection method comprising: normalizing, with a processor, audience measurement data corresponding to media exposure data, social media exposure data and programming information associated with a future quarter to determine normalized audience measurement data; classifying, with the processor, a media asset based on the programming information to determine a media asset classification; building, with the processor, a projection model based on a first subset of the normalized audience measurement data, the first subset of the normalized audience measurement data associated with a first time frame relative to the future quarter, the first subset of the normalized audience measurement data based on the media asset classification; and applying, with the processor, the programming information to the projection model to project ratings for the media asset.
 2. The method as defined in claim 1, further including classifying the media asset as a television series when a characteristic of the media asset is indicative of at least one of a premier episode, a repeat episode or a new episode.
 3. The method as defined in claim 2, further including, in response to classifying the media asset as a television series, retrieving series historical performance information related to the media asset.
 4. The method as defined in claim 1, further including classifying the media asset as special programming when a characteristic of the media asset is indicative of at least one of a movie or a sporting event.
 5. The method as defined in claim 1, further including excluding a subset of the first subset of the normalized audience measurement data based on the future quarter.
 6. The method as defined in claim 5, further including selecting a second subset of the normalized audience measurement data to train the projection model, the second subset of the normalized audience measurement data included in the first subset of the normalized audience measurement data and not included in the excluded subset of the first subset of the normalized audience measurement data.
 7. The method as defined in claim 1, wherein the first subset of the normalized audience measurement data includes historical data related to the media asset and related to a subset of media assets that are (1) included in the normalized audience measurement data and (2) related to the media asset.
 8. A ratings projection apparatus comprising: a data transformer to normalize audience measurement data corresponding to media exposure data, social media exposure data and programming information associated with a future quarter to determine normalized audience measurement data; a model builder to: classify a media asset based on the programming information to determine a media asset classification; build a projection model based on a first subset of the normalized audience measurement data, the first subset of the normalized audience measurement data associated with a first time frame relative to the future quarter, the first subset of the normalized audience measurement data based on the media asset classification; a ratings projector to apply the programming information to the projection model to project ratings for the media asset.
 9. The apparatus as defined in claim 8, wherein the model builder is to classify the media asset as a television series when a characteristic of the media asset is indicative of at least one of a premier episode, a repeat episode or a new episode.
 10. The apparatus as defined in claim 9, wherein the model builder retrieves series historical performance information when the media asset is classified as a television series.
 11. The apparatus as defined in claim 8, wherein the model builder is to classify the media asset as special programming when a characteristic of the media asset is indicative of at least one of a movie or a sporting event.
 12. The apparatus as defined in claim 8, wherein the model builder is to exclude a subset of the first subset of the normalized audience measurement data based on the future quarter.
 13. The apparatus as defined in claim 12, wherein the model builder is to select a second subset of the normalized audience measurement data to train the projection model, the second subset of the normalized audience measurement data included in the first subset of the normalized audience measurement data and not included in the excluded subset of the first subset of the normalized audience measurement data.
 14. The apparatus as defined in claim 8, wherein the first subset of the normalized audience measurement data includes historical data related to the media asset and related to a subset of media assets that are (1) included in the normalized audience measurement data and (2) related to the media asset.
 15. A tangible computer-readable storage medium comprising instructions that, when executed, cause a processor to at least: normalize audience measurement data corresponding to media exposure data, social media exposure data and programming information associated with a future quarter to determine normalized audience measurement data; classify a media asset based on the programming information to determine a media asset classification; build a projection model based on a first subset of the normalized audience measurement data, the first subset of the normalized audience measurement data associated with a first time frame relative to the future quarter, the first subset of the normalized audience measurement data based on the media asset classification; and apply the programming information to the projection model to project ratings for the media asset.
 16. The tangible machine-readable storage medium as defined in claim 15, wherein the instructions further cause the processor to: classify the media asset as a television series when a characteristic of the media asset is indicative of at least one of a premier episode, a repeat episode or a new episode; and retrieve series historical performance information related to the media asset when the media asset is classified as a television series.
 17. The tangible machine-readable storage medium as defined in claim 15, wherein the instructions further cause the processor to classify the media asset as special programming when a characteristic of the media asset is indicative of a movie or a sporting event.
 18. The tangible machine-readable storage medium as defined in claim 15, wherein the instructions further cause the processor to exclude a subset of the first subset of the normalized audience measurement data based on the future quarter.
 19. The tangible machine-readable storage medium as defined in claim 18, wherein the instructions further cause the processor to select a second subset of the normalized audience measurement data to train the projection model, the second subset of the normalized audience measurement data included in the first subset of the normalized audience measurement data and not included in the excluded subset of the first subset of the normalized audience measurement data.
 20. The tangible machine-readable storage medium as defined in claim 15, wherein the first subset of the normalized audience measurement data includes historical data related to the media asset and related to a subset of media assets that are (1) included in the normalized audience measurement data and (2) related to the media asset.